Abstract



The diffusion of ideas is often closely connected to the creation 
and diffusion of knowledge and to the technological evolution of soci- 
ety. Because of this, knowledge creation, exchange and its subsequent 
transformation into innovations for improved welfare and economic 
growth is briefly described from a historical point of view. Next, three 
approaches are discussed for modeling the diffusion of ideas in the ar- 
eas of science and technology, through (i) deterministic, (ii) stochastic, 
and (iii) statistical approaches. These are illustrated through their 
corresponding population dynamics and epidemic models relative to 
the spreading of ideas, knowledge and innovations. 

The deterministic dynamical models are considered to be appropriate
for analyzing the evolution of large and small societal, scientific
and technological systems when the influence of fluctuations is
insignificant. Stochastic models are appropriate when the system of
interest is small and the fluctuations become significant for its
evolution. Finally, statistical approaches and models based on the laws
and distributions of Lotka, Bradford, Yule, Zipf-Mandelbrot, and others
provide much useful information for the analysis of the evolution of
systems in which development is closely connected to the process of
idea diffusion.

| 10 IMPORTANT QUESTIONS RAISED IN THIS CHAPTER | AND THEIR ANSWERS in the form of guidance |
|---|---|
| 1. What is the connection between knowledge and capital? | Knowledge is often considered as a form of human capital |
| 2. What happens in the case of knowledge diffusion? | Knowledge is transferred when the subjects interact |
| 3. Should quantitative research be supplemented by qualitative research? | Yes, surely supplemented; coordinated joint aims are useful |
| 4. Who are the pioneers of scientometrics? | Alfred Lotka and Derek Price |
| 5. What is the relation between epidemic models and population dynamics models? | Epidemic models are a particular case of population dynamics models |
| 6. What has to be done if fluctuations strongly influence the system evolution? | Switch from deterministic to stochastic models and think |
| 7. Why are discrete models useful? | Often data is collected for some period of time. Thus, such data is best described by discrete models |
| 8. Around which statistical law are grouped all statistical tools described in the chapter? | Around Lotka law |
| 9. Are all possibly relevant models presented in this chapter? | NO! Only an appropriate selection. For more models, consult the literature or ask a specialist |
| 10. What is the strategy followed by the authors of the chapter? | Proceed from simple to more complicated models and from deterministic to stochastic models supplemented by statistical tools |

Table 1: Several questions and answers that should guide and supply useful
and important information for the reader.

1 Knowledge, capital, science research, and 
ideas diffusion 

1.1 Knowledge and capital 

Knowledge can be defined as a dynamic framework connected to cognitive
structures from which information can be sorted, processed and understood
[1]. Along economics lines of thought [2] [3] [4], knowledge can be treated as
one of the "production factors", i.e., one of the main causes of wealth in
modern capitalistic societies.

| Models described in this chapter | ARE USEFUL FOR |
|---|---|
| Science landscapes | Evaluation of research strategies. Decisions about personal development and promotion |
| Verhulst logistic curve | Description of a large class of growth processes |
| Broadcasting model of technology diffusion | Understanding the influence of mass media on technology diffusion |
| Word-of-mouth model | Understanding the influence of interpersonal contacts on technology diffusion |
| Mixed information source model | Understanding the influence of both mass media and interpersonal contacts on technology diffusion |
| Lotka-Volterra model of innovation diffusion with time lag | Understanding the influence of the time lag between hearing about innovation and its adoption |
| Price model of knowledge growth with time lag | Modeling the growth of discoveries, inventions, and scientific laws |
| SIR models of scientific epidemics | Modeling the epidemic stage of scientific idea spreading |
| SEIR models of scientific epidemics | Extends the SIR model by specifically adding the role of a class of scientists exposed to some scientific idea |

Table 2: List of models described in the chapter with comments on their
usefulness.

According to Marshall [5], a "capital" is a collection of goods external to
the economic agent that can be sold for money and from which an income can
be derived. Often, knowledge is parametrized as such a "human capital"
[6] [7] [8] [9] [10]. Walsh [11] was one pioneer in treating human knowledge
as if it were a "capital" in the economic sense; he made an attempt to find
measures for this form of "capital". Bourdieu [12], Coleman [13], Putnam
[14], Becker, and collaborators have further implanted the concept of such a
"human capital" in economic theory [15] [16] [17].

However, the concept of knowledge as a form of capital is an
oversimplification. This global-like concept does not account for many properties
of knowledge strictly connected to the individual, such as the possibility for
different learning paths or different views, multiple levels of interpretation,
and different preferences [18]. In fact, knowledge develops in a quite
complex social context, within possibly different frameworks or time scales, and
involves "tacit dimensions" (besides the basic space and time dimensions)
requiring coding and decoding [1].

| Models described in this chapter | ARE USEFUL FOR |
|---|---|
| Discrete model for the change in the number of authors in a scientific field | Modeling and forecasting the evolution in the number of authors and papers in a scientific field |
| Daley model | Modeling the evolution of a population of papers in a scientific field |
| Coupled discrete model for populations of scientists and papers | Modeling and forecasting the joint evolution of population of scientists and papers in a research field |
| Goffman-Newill model for the joint evolution of one scientific field and one of its sub-fields | Epidemic model for the increase of number of scientists from a research field who start work in a sub-field of the scientific field. The model also describes the increase in the number of papers in the research sub-field |
| Bruckner-Ebeling-Scharnhorst model for the evolution of n scientific fields | Understanding the joint evolution of scientific fields in presence of migration of scientists from one field to another field |

Table 3: List of models described in the chapter with comments on their
usefulness (Continuing Table 2).



FOR POLICY-MAKERS 

Take away box Nr. 1: Knowledge is much more than a form of capital: it
is a dynamic framework connected to cognitive structures from which 
information can be sorted, processed and understood. 



1.2 Growth and exchange of knowledge 

Science policy-makers and scholars have for many decades wished to develop
quantitative methods for describing and predicting the initiation and growth
of science research [19] [20] [21]. Thus, scientometrics has become one of
the core research activities in view of constructing science and technology
indicators [22].

| Models described in this chapter | ARE USEFUL FOR |
|---|---|
| SI model for the probability of intellectual infection | Modeling the spread of intellectual infection along a scientific network |
| SEI model for the probability of intellectual infection | Modeling the spread of intellectual infection along a scientific network in the presence of a class of scientists exposed to the intellectual infection |
| Stochastic evolution model | Modeling the number of scientists in a research subfield as a stochastic variable described by a master equation |
| Stochastic model of scientific productivity | Modeling the influence of fluctuations in scientific productivity through differential equations for the dynamics of a scientific community |
| Model of competition between ideologies | Understanding the competition between ideologies with possible migration of believers |
| Reproduction-transport model | Modeling the change of research field as a migration process |

Table 4: List of models described in the chapter with comments on their
usefulness (Continuation of Table 2).

The accumulation of knowledge in a country's population arises either
from acquiring knowledge from abroad or from internal engines [23] [24] [25]
[26]. The main engines for the production of new knowledge in a country are
usually: the public research institutes, the universities and training institutes,
the firms, and the individuals [27]. The users of the knowledge are firms,
governments, public institutions (such as the national education, health, or
security institutions), social organizations, and any concerned individual.
The knowledge is transferred from producers to the users by dissemination
that is realized by some flow or diffusion process [28], sometimes involving
physical migration.

Knowledge typically appears at first as purely tacit: a person "has" an
idea [29] [30]. This tacit knowledge must be codified for further use; after
codification, knowledge can be stored in different ways, as in textbooks or 
digital carriers. It can be transferred from one system to another. In addition 
to knowledge creation, a system can gain knowledge by knowledge exchange 
and/or trade. 

| Laws described in this chapter | ARE USEFUL FOR |
|---|---|
| Lotka law | Describing the number distribution of scientists with respect to the number of papers they wrote |
| Pareto distribution | Writing a continuous version of Lotka law |
| Zipf law and Zipf-Mandelbrot law | Ranking scientists by the number of papers they wrote |
| Bradford law | Reflecting the fact that a large number of relevant articles are concentrated in a small number of journals |

Table 5: List of laws discussed in the chapter with a few words on their
usefulness (Continuation of Table 2).

In knowledge diffusion, knowledge is transferred while subjects interact
[31] [32]. Pioneering studies on knowledge diffusion investigated the
patterns through which new technologies are spread in social systems [3H |35] . 
The gain of knowledge due to knowledge diffusion is one of the keys or leads
to innovative products and innovations [36] [37].



FOR POLICY-MAKERS 
Take away box Nr. 2: 

An innovative product or a process is new for the group of people who are 
likely to use it. Innovation is an innovative product or process that has passed 
the barrier of user adoption. Because of the rejection by the market, many 
innovative products and processes never become an innovation. 



In science, the diffusion of knowledge is mainly connected to the transfer
of scientific information by publications. It is accepted that the results of
some research become completely scientific when they are published [38].
Such a diffusion can also take place at scientific meetings and through oral
or other exchanges, sometimes without formal publication of exchanged ideas.

FOR POLICY-MAKERS
Take away box Nr. 3:

Scientific communication has specific features. For example, citations are
very important in the communication process as they place corresponding
research and researchers, mentioned in the scientific literature, in a way
similar to the kinship links that tie persons within a tribe. Informal exchanges
happening in the process of common work at the time of meetings, workshops,
or conferences may accelerate the transfer of scientific information,
whence the growth of knowledge.



2 Qualitative research. Historical remarks. 

2.1 Science landscapes 

Understanding the diffusion of knowledge requires research complementary 
to mathematical investigations. For example, mathematics cannot indicate 
why the exposure to ideas leads to intellectual epidemics. Yet, mathematics 
can provide information on the intensity or the duration of some intellectual 
epidemics. 

Qualitative research is all about exploring issues, understanding phenomena,
and answering questions [40] without much mathematics. Qualitative
research involves empirical research through which the researcher explores
relationships using a textual methodology rather than quantitative data.
Problems and results in the field of qualitative research on knowledge epidemics
will not be discussed in detail here. However, through one example it can be
shown how mathematics can create the basis for qualitative research and
decision making. This example is connected to the science landscape concepts
outlined here below.

The idea of science landscapes has some similarity with the work of Wright
[41] in biology, who proposed that the fitness landscape evolution can be
treated as an optimization process based on the roles of mutation, inbreeding,
crossbreeding, and selection. The science landscape idea was developed
by Small [42] [43], as well as by Noyons and van Raan [44]. In this framework,
Scharnhorst [45] [46] proposed an approach for the analysis of scientific
landscapes, named "geometrically oriented evolution theory".

FOR POLICY-MAKERS
Take away box Nr. 4: 

The concept of science landscape is rather simple: Describe the correspond- 
ing field of science or technology through a function of parameters such as 
height, weight, size, technical data, etc. Then a virtual knowledge landscape 
can be constructed from empirical data in order to visualize and understand 
innovation and to optimize various processes in science and technology. 



As an illustration at this level, consider that a mathematical example of a 
technological landscape can be given by a function C = C(S,v), where C is 
the cost for developing a new airplane, and where S and v represent the size 
and velocity of the airplane. 

Consider two examples concerning the use of science landscapes for eval- 
uation purposes: 

(1) Science landscape approach as a method for evaluating na- 
tional research strategies 

For example, national science systems can be considered as made of re- 
searchers who compete for scientific results, and subsidies, following optimal 
research strategies. The efforts of every country become visible, compara- 
ble and measurable by means of appropriate functions or landscapes: e.g., 
the number of publications. The aggregate research strategies of a country 
can thereby be represented by the distribution of publications in the various 
scientific disciplines. In so doing, within a two-dimensional space¹, different
countries correspond to different landscapes. Various political discussions 
can follow and evolution strategies can be invented thereafter. 

Notice that the dynamics of self-organized structures in complex systems 
can be understood as the result of a search for optimal solutions to a certain 
problem. Therefore, such a comment shows how rather strict mathemati- 
cal approaches, not disregarding simulation methods, can be congruent to 
qualitative questions. 

(2) Scientific citations as landscapes for individual evaluation
Scientific citations can serve for constructing landscapes. Indeed, citations
have a key position in the retrieval and valuation of information in scientific
communication systems [15] [47] [48]. This position is based on the objective
nature of the citations as components of a global information system, as
represented by the Science Citation Index. A landscape function based on
citations can be defined in various ways. It can take into account self-citations
[49] [50] [51] [52], or time-dependent quantitative measures [53] [54] [55].

¹ E.g., take the scientific disciplines and the number of publications as axes.



FOR POLICY-MAKERS 
Take away box Nr. 5: 

Citation landscapes become important elements of a science policy (e.g., 
in personnel management decisions), thereby influencing individual scientific 
careers, evaluation of research institutes, and investment strategies. 



2.2 Lotka and Price: pioneers of scientometrics 

Alfred Lotka, one of the modern founders of population dynamics studies,
was also an excellent statistician. He discovered [57] a distribution for the
number of authors $n_r$ as a function of the number of published papers $r$,
i.e., $n_r = n_1 / r^2$.
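
As a small numerical illustration of the inverse-square distribution $n_r = n_1 / r^2$, the following Python sketch tabulates the expected number of authors for the first few values of $r$. The function name and the value $n_1 = 1000$ are hypothetical and serve only as an example; they are not taken from Lotka's data.

```python
def lotka_expected_authors(n1: float, r_max: int) -> dict:
    """Expected number of authors with exactly r papers, assuming n_r = n_1 / r**2."""
    return {r: n1 / r**2 for r in range(1, r_max + 1)}


# Hypothetical field in which 1000 authors have published exactly one paper.
counts = lotka_expected_authors(n1=1000, r_max=5)
for r, n_r in counts.items():
    print(f"r = {r}: about {n_r:.1f} authors")
```

Under this assumption, the number of authors with two papers is one quarter of the number with one paper, and the number with five papers is one twenty-fifth of it.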

However, Derek Price, a physicist, set the mathematical basis in the field
of measuring scientific research in recent times [58] [59] [60]. He proposed a
model of scientific growth connecting science and time. In the first version
of the model, the size of science was measured by the number of journals
founded in the course of a number of years. Later, instead of the number of
journals, the number of published papers was used as the measure of scientific
growth. Price and other authors [59] [60] [61] also considered different
indicators of scientific growth, such as the number of authors, funds, dissertation
production, citations, or the number of scientific books.

In addition to the deterministic approach initiated by Price, the statistical
approach to the study of scientific information developed rapidly and
nowadays is still an important tool in scientometrics [62] [63]. More discussion
on the statistical approach will be given in section 6 of this chapter.

FOR POLICY-MAKERS
Take away box Nr. 6:

Price distinguished three stages in the growth of knowledge: (a) a preliminary
phase with small increments; (b) a phase of exponential growth; (c) a
saturation stage. The stage (c) must be reached sooner or later after the
new ideas and opportunities are exhausted; the growth slows down until a
new trend emerges and gives rise to a new growth stage. According to Price,
the curve of this growth is an S-shaped logistic curve.



2.3 Population dynamics and epidemic models of the 
diffusion of knowledge 




Figure 1: Relation among epidemic models, Lotka-Volterra models, and pop- 
ulation dynamics models. 

Population dynamics is the branch of life sciences that studies short- and
long-term changes in the size and age composition of populations, and how
the biological and environmental processes influence those changes. In the
past, most models for biological population dynamics have been of interest
only in mathematical biology [64] [65]. Today, these models are adapted and
applied in many more areas of science [66] [67]. Here below, models of knowledge
dynamics will be of interest as bases of epidemic models. Such models
are nowadays used because some stages of idea spreading processes within a
population (e.g., of scientists) possess properties like those of epidemics.

The mathematical modeling of epidemic processes has attracted much
attention since the spread of infectious diseases has always been of great
concern and considered to be a threat to public health [68] [69] [70]. In the
history of science and society, many examples of ideas spreading seem to
occur in a way similar to the spread of epidemics. Examples of the former
field pertain to the ideas of Newton on mechanics and the passion for "High
Critical Temperature Superconductivity" at the end of the twentieth century.
Examples of the latter field are the spreading of ideas from Moses or Buddha
[71], or discussions based on the Kermack-McKendrick model [72] for the
epidemic stages of revolutions or drug spreading [73].

Epidemic models belong to a more general class of Lotka-Volterra models
used in research on systems in the fields of biological population dynamics,
social dynamics, and economics. The models can also be used for describing
processes connected to the spread of knowledge, ideas and innovations (see
Fig. 1). Two examples are the model of innovation in established organizations
[71] and the Lotka-Volterra model for forecasting emerging technologies
and the growth of knowledge [36]. In social dynamics, the Lanchester model
of war between two armies can be mentioned, a model which in the case of
reinforcements coincides with the Lotka-Volterra-Gause model for competition
between two species [75]. Solomon and Richmond [76] [77] applied a
Lotka-Volterra model to financial markets, while the model for the trap of
extinction can be applied to economic subjects [78]. Applications to chaotic
pairwise competition among political parties [79] could also be mentioned.

To start the discussion of population dynamics models as applied to the 
growth of scientific knowledge with special emphasis on epidemic models, two 
kinds of models can be discussed (Fig. 2): (1) deterministic models, see 
Sec. 3, appropriate for large and small populations where the fluctuations 
are not drastically important, (2) stochastic models, see Sec. 4, appropri- 
ate for small populations. In the latter case the intrinsic randomness appears 
much more relevant than in the former case. Stochastic models for large pop- 
ulations will not be discussed. The reason for this is that such models usually 
consist of many stochastic differential equations, whence their evolution can 
be investigated only numerically. 

Finally, let us mention that the knowledge diffusion is closely connected 
to the structure and properties of the social network where the diffusion 
happens. This is a new and very promising research area. For example, a 
combination can be made between the theory of information diffusion and 
the theory of complex networks [80]. For more information about the relation 
between networks and knowledge, see the following chapters of the book. 

| Fluctuations | Small system size | Large system size |
|---|---|---|
| not important | Section 3 | Section 3 |
| important | Section 4 | not discussed |

Figure 2: Relationships between system size, influence of fluctuations, and
discussed classes of models.

3 Deterministic models 

Below, 13 selected deterministic models (see Fig. 3) are discussed. The 
emphasis is on models that can be used for describing the epidemic stage of 
the diffusion of ideas, knowledge, and technologies. 

3.1 Logistic curve and its generalizations 

In a number of cases, the natural growth of autonomous systems in competition
can be described by the logistic equation and the logistic curve (S-curve)
[81]. In order to describe trajectories of growth or decline in socio-technical
systems, one generally applies a three-parameter logistic curve:

$$N(t) = \frac{K}{1 + \exp[-\alpha (t - \beta)]} \qquad (1)$$

where $N(t)$ is the number of units in the species or growing variable to study;
$K$ is the asymptotic limit of growth; $\alpha$ is the growth rate which specifies the
"width" of the S-curve for $N(t)$; and $\beta$ specifies the time $t_m$ when the curve
reaches the midpoint of the growth trajectory, such that $N(t_m) = 0.5 K$.

Figure 3: Discrete (3) and continuous (10) models discussed in the chapter.
Two continuous models account for the influence of time lag, three models
are simple models of technological diffusion. Two models are simple epidemic
models and two models are more complicated models. In addition, the basic
logistic curve is discussed.

The three parameters, $K$, $\alpha$, and $\beta$, are usually obtained after fitting some data
[82]. It is well known that many cases of epidemic growth can be described
by parts of an appropriate S-curve. As an example, recall that the S-curve 
was also used for describing technological substitution [3U |S3J [HI], ca. 60 
years ago. 
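
The role of the three parameters can be checked numerically. The Python sketch below assumes the standard three-parameter logistic form $N(t) = K / (1 + \exp[-\alpha (t - \beta)])$; the parameter values are made up for illustration and are not fitted to any data set.

```python
import math


def logistic(t: float, K: float, alpha: float, beta: float) -> float:
    """Three-parameter logistic curve N(t) = K / (1 + exp(-alpha * (t - beta)))."""
    return K / (1.0 + math.exp(-alpha * (t - beta)))


# Illustrative (made-up) parameter values.
K, alpha, beta = 500.0, 0.4, 20.0

# At t = beta the curve sits exactly at half of the asymptotic limit K.
midpoint = logistic(beta, K, alpha, beta)

# Times at which N(t) reaches 10% and 90% of K; their distance is one
# possible measure of the "width" of the S-curve and equals ln(81) / alpha.
t10 = beta - math.log(9.0) / alpha
t90 = beta + math.log(9.0) / alpha
width = t90 - t10

print(midpoint, width)
```

A larger $\alpha$ therefore gives a narrower S-curve, while $\beta$ only shifts the curve along the time axis.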

However, different interaction schemes can generate different growth patterns
for whatever system species are under consideration [85]. Not every
interaction scheme leads to a logistic growth [86]. The evolution of systems
in such regimes may be described by more complex curves, such as a combination
of two or more simple three-parameter functions [81] [87].
3.2 Simple epidemic and Lotka-Volterra models of technology diffusion

As recalled here above, the simplest epidemic models can be used for
describing technology diffusion by considering two populations/species: adopters
and non-adopters of some technology. Such models can be put into two basic
classes: either broadcasting (Fig. 4) or word-of-mouth models (Fig. 5).
In the broadcasting models, the source of knowledge about the existence
and/or characteristics of the new technology is external and reaches all possible
adopters in the same way. In the word-of-mouth models, the knowledge
is diffused by means of personal interactions.

(1) The broadcasting model (Fig. 4)
Let us consider a population of $K$ potential adopters of the new technology
and let each adopter switch to the new technology as soon as he/she
hears about its existence (immediate infection through broadcasting). The
probability that at time $t$ a new subject will adopt the new technology is
characterized by a coefficient of diffusion $\kappa(t)$ which might or might not be
a function of the number of previous adopters. In the broadcasting model
$\kappa(t) = a$ with $0 < a < 1$; this is considered to be a measure of the infection
probability.

Let $N(t)$ be the number of adopters at time $t$. The increase in adopters
for each period is equal to the probability of being infected, multiplied by
the current population of non-adopters [88]. The rate of diffusion at time $t$
is

$$\frac{dN}{dt} = a [K - N(t)]. \qquad (2)$$

The integration of (2), with no adopters at $t = 0$, leads to the number of
adopters, i.e.,

$$N(t) = K [1 - \exp(-a t)]. \qquad (3)$$

$N(t)$ is described by a decaying exponential curve.
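
The broadcasting dynamics can be cross-checked numerically. The following Python sketch integrates $dN/dt = a [K - N(t)]$ with simple Euler steps and compares the result with the closed form $N(t) = K [1 - \exp(-a t)]$, which assumes that there are no adopters at $t = 0$. The function names and the values of $K$ and $a$ are hypothetical.

```python
import math


def broadcasting_closed_form(t, K, a):
    """N(t) = K * (1 - exp(-a t)): solution of dN/dt = a (K - N) with N(0) = 0."""
    return K * (1.0 - math.exp(-a * t))


def broadcasting_euler(t_end, K, a, dt=0.001):
    """Integrate dN/dt = a (K - N) with explicit Euler steps, starting from N(0) = 0."""
    n = 0.0
    for _ in range(int(round(t_end / dt))):
        n += dt * a * (K - n)
    return n


K, a = 1000.0, 0.3            # illustrative values, with 0 < a < 1
exact = broadcasting_closed_form(10.0, K, a)
approx = broadcasting_euler(10.0, K, a)
print(exact, approx)
```

The number of adopters rises fastest at the start and then saturates towards $K$, since the pool of non-adopters shrinks exponentially.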

Figure 4: Schematic representation of a broadcasting model of technology
diffusion. The number of adopters of technology increases by mass media
influence.

(2) Word-of-mouth model (Fig. 5)

In many cases, however, the technology adoption timing is at least an order
of magnitude slower than the time it takes for information spreading [89].
This requires a different model than in (1): the word-of-mouth diffusion
model. Its basic assumption is that knowledge diffuses by means of face-to-face
interactions. Then the probability of receiving the relevant knowledge
needed to adopt the new technology is a positive function of current users
$N(t)$. Let the coefficient of diffusion be $b N(t)$ with $b > 0$. The rate of
diffusion at time $t$ is

$$\frac{dN}{dt} = b N(t) [K - N(t)]. \qquad (4)$$

Then

$$N(t) = \frac{K}{1 + \frac{K - N_0}{N_0} \exp[-b K (t - t_0)]} \qquad (5)$$

where $N_0 = N(t = t_0)$. $N(t)$ is described by an S-shaped curve.

A constraint exists in the word-of-mouth model: it explains the diffusion 
of an innovation not from the date of its invention but from the date when 
some number, N(t) > 0, of early users have begun using it. 
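
The dependence on the initial group of early users can be made explicit. The Python sketch below assumes the logistic solution of $dN/dt = b N(t) [K - N(t)]$, namely $N(t) = K / (1 + \frac{K - N_0}{N_0} \exp[-b K (t - t_0)])$ with $N_0 = N(t_0) > 0$, and computes the time at which half of the potential adopters have adopted. The helper names and parameter values are hypothetical.

```python
import math


def word_of_mouth(t, K, b, n0, t0=0.0):
    """N(t) = K / (1 + ((K - n0) / n0) * exp(-b K (t - t0))), valid for n0 > 0."""
    return K / (1.0 + (K - n0) / n0 * math.exp(-b * K * (t - t0)))


def half_adoption_time(K, b, n0, t0=0.0):
    """Time at which N(t) = K / 2, where the adoption rate is largest."""
    return t0 + math.log((K - n0) / n0) / (b * K)


K, b = 1000.0, 0.0005         # illustrative values only
for n0 in (1.0, 10.0, 100.0):
    print(n0, round(half_adoption_time(K, b, n0), 2))
```

The fewer early users there are, the later the S-curve takes off; with $N_0 = 0$ the formula is undefined, which mirrors the constraint that the model cannot start from the date of invention.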

Figure 5: Schematic representation of a word-of-mouth model of technology
diffusion. The number of adopters of technology increases by interpersonal
interactions.

(3) Mixed information source model (Fig. 6)

In the mixed information source model, existing non-adopters are subject to
two sources of information (Fig. 6). The coefficient of diffusion is supposed
to look like $a + b N(t)$. The model evolution equation becomes

$$\frac{dN}{dt} = [a + b N(t)] [K - N(t)]. \qquad (6)$$

The result of Eq.(6) is a (generalized) logistic curve whose shape is determined
by $a$ and $b$ [88].
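
A minimal numerical sketch of the mixed model follows, assuming the evolution equation $dN/dt = [a + b N(t)] [K - N(t)]$ that results from a diffusion coefficient $a + b N(t)$ acting on the $K - N(t)$ non-adopters. It uses a classical fourth-order Runge-Kutta scheme, which is a choice made here for illustration; the function names and parameter values are hypothetical.

```python
import math


def mixed_rate(n, K, a, b):
    """Right-hand side dN/dt = (a + b N) (K - N)."""
    return (a + b * n) * (K - n)


def integrate_rk4(K, a, b, t_end, n0=0.0, dt=0.01):
    """Classical fourth-order Runge-Kutta integration of the mixed model."""
    n = n0
    for _ in range(int(round(t_end / dt))):
        k1 = mixed_rate(n, K, a, b)
        k2 = mixed_rate(n + 0.5 * dt * k1, K, a, b)
        k3 = mixed_rate(n + 0.5 * dt * k2, K, a, b)
        k4 = mixed_rate(n + dt * k3, K, a, b)
        n += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return n


K, a, b = 1000.0, 0.02, 0.0004                        # illustrative values only
only_media = integrate_rk4(K, a, 0.0, t_end=10.0)     # b = 0: broadcasting limit
both = integrate_rk4(K, a, b, t_end=10.0)             # mass media plus contacts
print(only_media, both)
```

Setting $b = 0$ recovers the broadcasting model, and adding interpersonal contacts ($b > 0$) speeds up adoption for the same mass media influence $a$.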

Figure 6: Schematic representation of mixed information source model. The
number of adopters increases by mass media influence and interpersonal
contacts.

(4) Time lag Lotka-Volterra model of innovation diffusion (Fig. 7)

Let it be again assumed that the diffusion of innovation in a society is
accounted for by a combination of two processes: a mass-mediated process
and a process connected to interpersonal (word-of-mouth) contacts. Let $N(t)$
be the number of potential adopters. Some of the potential adopters adopt
the innovation and become real adopters. The equation for the rate of
growth of the real adopters $n(t)$, in absence of time lag, is

$$\frac{dn}{dt} = \alpha [N(t) - n(t)] + \beta n(t) [N(t) - n(t)] - \mu n(t), \qquad (7)$$

where $\alpha$ denotes the degree of external influence such as mass media, $\beta$
accounts for the degree of internal influence by interpersonal contact between
adopters and the remaining population; $\mu$ is a parameter characterizing the
decline in the number of adopters because of technology rejection for whatever
reason.

Figure 7: Schematic representation of a Lotka-Volterra model with time lag.
The model accounts for the time lag between hearing about innovation and
its adoption.
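
For a constant number of potential adopters, the balance between adoption and rejection can be worked out explicitly. The Python sketch below assumes the model without time lag, $dn/dt = \alpha (N - n) + \beta n (N - n) - \mu n$, with $N$ held fixed, and computes the level $n^*$ at which the growth rate vanishes. The function names and parameter values are hypothetical.

```python
import math


def adoption_rate(n, N, alpha, beta, mu):
    """dn/dt = alpha (N - n) + beta n (N - n) - mu n, with N held constant."""
    return alpha * (N - n) + beta * n * (N - n) - mu * n


def steady_state(N, alpha, beta, mu):
    """Positive root of beta n^2 - (beta N - alpha - mu) n - alpha N = 0."""
    c = beta * N - alpha - mu
    return (c + math.sqrt(c * c + 4.0 * alpha * beta * N)) / (2.0 * beta)


N, alpha, beta, mu = 1000.0, 0.01, 0.0005, 0.05   # illustrative values only
n_star = steady_state(N, alpha, beta, mu)
print(n_star, adoption_rate(n_star, N, alpha, beta, mu))
```

Because of the rejection term $\mu n$, the number of real adopters settles below the number of potential adopters $N$.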

A basic limitation in most models of innovation diffusion has been the 
assumption of instantaneous acceptance of the new innovation by a potential 
adopter [HSJ EH]- Often, in reality, there is a finite time lag between the 
moment when a potential adopter hears about a new innovation and the time 
of adoption. Such time lags usually are continuously distributed [HH E2] ■ 

The time lag between the knowledge about the innovation and its adoption
can be captured by a distributed time lag approach in which the effects
of time delays are expressed as a weighted response over a finite time interval
through appropriately chosen memory kernels [93] (see Fig. 7). Whence
Eq.(7) becomes

$$\frac{dn}{dt} = \alpha \int_0^t d\tau \, K_1(t - \tau) [N(\tau) - n(\tau)] + \beta \int_0^t d\tau \, K_2(t - \tau) n(\tau) [N(\tau) - n(\tau)] - \mu \int_0^t d\tau \, K_3(t - \tau) n(\tau). \qquad (8)$$

Eq.(8) reduces to Eq.(7) when the memory kernels $K_i(t)$ ($i = 1, 2, 3$) are
replaced by delta functions.

Two generic types of kernels are usually considered [92]:

$$K_i(t) = \nu e^{-\nu t} \qquad (9)$$

$$K_i(t) = \nu^2 t e^{-\nu t}, \qquad (10)$$

in which $\nu^{-1}$ is some characteristic time scale of the system.
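
The two kernel types can be compared numerically. The Python sketch below assumes the forms $K(t) = \nu e^{-\nu t}$ and $K(t) = \nu^2 t e^{-\nu t}$ and checks, by simple midpoint quadrature, that each integrates to one and what mean delay each implies. The function names, the value of $\nu$ and the truncation of the integrals at $t = 80$ are choices made here for illustration.

```python
import math


def weak_kernel(t, nu):
    """K(t) = nu * exp(-nu t): the most recent past carries the largest weight."""
    return nu * math.exp(-nu * t)


def strong_kernel(t, nu):
    """K(t) = nu**2 * t * exp(-nu t): the weight peaks at a delay t = 1 / nu."""
    return nu * nu * t * math.exp(-nu * t)


def integrate(f, t_max, steps=20_000):
    """Midpoint-rule quadrature of f on [0, t_max]."""
    h = t_max / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))


nu = 0.5                                   # illustrative value; 1 / nu = 2 time units
results = {}
for kernel in (weak_kernel, strong_kernel):
    area = integrate(lambda t: kernel(t, nu), 80.0)
    mean_delay = integrate(lambda t: t * kernel(t, nu), 80.0)
    results[kernel.__name__] = (area, mean_delay)
    print(kernel.__name__, round(area, 4), round(mean_delay, 4))
```

Analytically, both kernels have unit area, and their mean delays are $1/\nu$ and $2/\nu$ respectively, so the second kernel describes a response that is both later and peaked away from the present.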

The number of potential adopters $N(t)$ changes over time. Several possible
functional forms of $N(t)$ are used [94]:

$$N(t) = N_0 (1 + a t); \quad N_0 > 0, \ a > 0 \qquad (11)$$

$$N(t) = N_0 \exp[g t]; \quad N_0 > 0, \ g > 0 \qquad (12)$$

N(t) = — r; 6>0,rf>0,c>0 (13) 

$$N(t) = b - q \exp(-r t); \quad b > 0, \ q > 0, \ r > 0. \qquad (14)$$



Eq.(12) represents an approximation for short- and medium-term forecasting
since for t large, N(t) grows without bound, as in Keynes [95]. Eqs.(13) and
(14) are useful in long-term forecasting as N(t) has an upper limit. Such
forms for N(t) are valid within a deterministic framework.

However, a stochastic framework (see below) is more appropriate when
the carrying capacity $N(t)$ is governed by some stochastic process, as when
the influence of socioeconomic and natural factors is subject to "random"
or hardly explainable fluctuations. In such systems, $N(t)$ can be time-dependent:
for example, $N(t) \sim N_0 (1 + \epsilon \cos(\omega t))$ where $\epsilon \ll 1$ and the
periodicity takes into account the influence of some (strong) cyclic economic
factors. In presence of a strong stochastic component, $N(t)$ can be stochastic:
$N(t)$ is then the sum of $N_0$ and a noisy component, where $N_0$ is the average
value of the so-called carrying capacity [96].
FOR POLICY-MAKERS
Take away box Nr. 7: 

Time lags between observations and decisions lead to complicated dynamics. 
Perform some preliminary careful analysis of system behavior based on time 
lags before making a decision. 



3.3 Price model of knowledge growth. Cycles of growth 
of knowledge 
Figure 8: Diagram of relationships between the Price model and its
modifications. The presence of time lags can lead to much complication in
the evolution dynamics of a scientific field.
The Price evolution model of scientific growth ignited intensive research
[97, 98] (see Fig. 8). This model is in fact a dialectical addition to Kuhn's idea
[99] about the revolutionary nature of science processes: after some period
of evolutionary growth, a scientific revolution occurs. Price considered
exponential growth to be a disease that retards the growth of stable science,
producing narrower and less flexible specialists.



FOR POLICY-MAKERS 
Take-away box Nr. 8 

An interesting result of the research of Price can be read as follows: 
if a government wants to double the usefulness of science, it has 
to multiply by about eight the gross number of workers and the 
total expenditure of manpower and national income. 



The unreserved application of the Price model faces several difficulties: 

• many scientific products which seem to be new are not really new; 

• creativity and innovation can be confused [100, 101];

• creative papers with new ideas and results have the same importance
as trivial duplications [102];

• two things are omitted: 

— quality (whatever that means, but it is an economic notion) of 
research; 

— the cost or measure of complexity. 

In answer to this, Price formulated the hypothesis that one should study
only the growth of important discoveries, inventions, and scientific laws,
rather than both important and trivial things. In so doing, one might expect
that any such growth will follow the same pattern.

A generalized version of the Price model for the growth of a scientific field
[103, 104] is based on the following assumptions: (a) the growth is measured
by the number of important publications appearing at a given time; (b) the
growth has a continuous character, though a finite time period T = const is
needed to build up a result of fundamental character; (c) the interactions
between various scientific fields are neglected. If, in addition, the number of
scientists publishing results in this field is constant, then the rate of scientific
growth at time t is proportional to the number of important publications at
the earlier time t - T, where T is the time period required to build up a
fundamental result. The model equation is

$$\frac{dx}{dt} = a\,x(t-T), \tag{15}$$

where a is a constant. The initial condition $x(t) = \phi(t)$ is defined on the
interval [-T, 0].

Let the population of scientists be varying and consider the evolution of
the average number of papers per scientist. In general, instead of the linear
right-hand side of Eq.(15), a non-linear model can be used:

$$\frac{dx}{dt} = f(x(t-T), x(t)), \tag{16}$$

where f is a homogeneous function of degree one. The simplest form
of such a function is a linear function. Let n(t) represent the rate of growth
of the population of scientists and write L(t) = exp[n(t) t]. For simplicity,
let the population of scientists grow at the constant rate $n = \dot{L}/L$ and let
z = x/L. Then the evolution of the number of papers written by a scientist
has the form

$$\frac{dz}{dt} = a\,z(t-T) - n\,z(t). \tag{17}$$

If n = 0 and T = 0, the Price model of exponential growth is recovered.
Eq.(17) is linear, but a cyclic behavior may appear because of the feedback
between the delayed and non-delayed terms.
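The delayed linear model dz/dt = a z(t - T) - n z(t) can be explored numerically. The following is a minimal sketch, assuming a forward-Euler scheme with a stored history buffer; the function name, the step size and the choice of integrator are illustrative assumptions, not part of the original model description.

```python
def simulate_delayed_price(a, n, T, phi, t_end, dt=0.01):
    """Forward-Euler integration of dz/dt = a*z(t-T) - n*z(t).

    phi(t) is the history function giving z on [-T, 0] (hypothetical helper
    argument). Returns the samples of z at t = 0, dt, ..., t_end.
    """
    lag = int(round(T / dt))                 # delay expressed in time steps
    # History samples at t = -T, -T + dt, ..., 0 (lag + 1 values).
    z = [phi(-T + k * dt) for k in range(lag + 1)]
    steps = int(round(t_end / dt))
    for _ in range(steps):
        z_now = z[-1]                        # z(t)
        z_delayed = z[-1 - lag]              # z(t - T)
        z.append(z_now + dt * (a * z_delayed - n * z_now))
    return z[lag:]                           # drop the pre-history part


if __name__ == "__main__":
    # With T = 0 and n = 0 the scheme approximates pure exponential growth.
    z = simulate_delayed_price(a=0.5, n=0.0, T=0.0, phi=lambda t: 1.0, t_end=1.0, dt=0.001)
    print(z[-1])
```

Setting T > 0 and varying a and n in this sketch is one way to look for the oscillatory regimes mentioned above; a smaller dt or a dedicated delay-equation solver would be preferable for quantitative work.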



3.4 Models based on three or four populations. Discrete
models.

(1) SIR (Susceptible-Infected-Removed) model (Fig. 9) 

In 1927, Kermack and McKendrick [72] created a model in which they
considered a fixed population with only three compartments: S(t), the
susceptibles; I(t), the infected; R(t), the recovered, or removed.

Following this idea, Goffman and Newill [71, 105] considered the stages of
fast growth of scientific research in a scientific field as "intellectual epidemics"
and developed the corresponding scientific research epidemic stage based on
three classes of population: (i) the susceptibles S who can become infectives
when in contact with infectious material (the ideas); (ii) the infectives I who
host the infectious material; and (iii) the recovered R who are removed from
the epidemics for different reasons (Fig. 9).

Figure 9: SIR (susceptibles S, infectives I, recovered R) model of intellectual
infection with influxes of susceptibles and infectives to the corresponding
scientific ideas.

The epidemic stage is controlled by the system of differential equations

$$\frac{dS}{dt} = -\beta S I - \delta S + \mu, \tag{18}$$
$$\frac{dI}{dt} = \beta S I - \gamma I + \nu, \tag{19}$$
$$\frac{dR}{dt} = \delta S + \gamma I, \tag{20}$$

where $\mu$ and $\nu$ are the rates at which the new supply of susceptibles and
infectives enter the population. A necessary condition for the process to
enter the epidemic state is $dI/dt > 0$. Then

$$S > \frac{\gamma - \nu/I}{\beta} = \rho \tag{21}$$

is the threshold density of susceptibles, i.e., no epidemic can develop from
time t unless S, the number of susceptibles at that time, exceeds the
threshold $\rho$: the epidemic state cannot be maintained over some time interval unless
the number of susceptibles is larger than $\rho$ throughout that interval. As
I increases, $\nu/I$ converges to 0 and $\rho$ converges rapidly to $\gamma/\beta$.
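The threshold condition can be checked numerically. The sketch below assumes the SIR system with influxes in the form dS/dt = -beta*S*I - delta*S + mu, dI/dt = beta*S*I - gamma*I + nu, dR/dt = delta*S + gamma*I, and a threshold rho = (gamma - nu/I)/beta; the function names and the Euler step are illustrative assumptions.

```python
def sir_step(S, I, R, beta, gamma, delta, mu, nu, dt):
    """One forward-Euler step of the SIR model with influxes mu (susceptibles)
    and nu (infectives). Returns the updated (S, I, R)."""
    dS = -beta * S * I - delta * S + mu
    dI = beta * S * I - gamma * I + nu
    dR = delta * S + gamma * I
    return S + dt * dS, I + dt * dI, R + dt * dR


def threshold(I, beta, gamma, nu):
    """Threshold density of susceptibles: dI/dt > 0 requires S > rho."""
    return (gamma - nu / I) / beta


if __name__ == "__main__":
    beta, gamma = 0.01, 0.5
    rho = threshold(I=1.0, beta=beta, gamma=gamma, nu=0.0)
    for S0 in (100.0, 40.0):                 # one value above rho, one below
        _, I1, _ = sir_step(S0, 1.0, 0.0, beta, gamma, 0.0, 0.0, 0.0, dt=0.1)
        print(S0, rho, "I grows" if I1 > 1.0 else "I declines")
```

The two starting values in the usage part sit on either side of the threshold, so the number of infectives grows in one case and declines in the other.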

In [71], Goffman evaluated the rate of change of infectives $\Delta I / \Delta t$. From
the system equations, it is difficult to determine I(t). Yet in the epidemic
stage, the behaviour of I(t) is exponential. For small t close to $t_0$, I(t) can
be expanded into a power series, $I(t) = c_0 + c_1 t + c_2 t^2 + \dots + c_n t^n + \dots$,
such that the approximate rate $\Delta I / \Delta t$ can be obtained. On the basis of this
rate and the raw data, the development and peak of some research activity
can be predicted, under the assumption that the research is in an epidemic
stage.

(2) SEIR model for the spreading of scientific ideas (Fig. 10) 

The SIR epidemic models can be further refined by introducing a fourth class,
E, i.e., persons exposed to the corresponding scientific ideas (Fig. 10). Such
models are discussed in [106, 107]; they belong to the class of so-called SEIR
epidemic models. One typical model goes as follows:

$$\frac{dS}{dt} = \Lambda N - \frac{\beta S I}{N}, \qquad \frac{dE}{dt} = \frac{\beta S I}{N} - \kappa E - \frac{\rho E I}{N}, \tag{22}$$
$$\frac{dI}{dt} = \kappa E + \frac{\rho E I}{N} - \gamma I, \qquad \frac{dR}{dt} = \gamma I, \tag{23}$$

where S(t) is the size of the susceptible population at time t, E(t) is the size
of the exposed class, and I(t) is the size of the infected class. The infected
individuals have adopted the new scientific idea in their publications. Finally,
R(t) is the size of the population of recovered scientists, i.e., those who no
longer publish on the topic. The size of the entire population is
N = S + E + I + R. An exit term is assumed to be very small, and because of
this, it is included in the recovered class. N grows exponentially with rate
$\Lambda$. The parameters of the model are: $\beta$, the probability and effectiveness
of a contact with an adopter; $1/\kappa$, the standard latency time (in other words,
the average duration of time after one has been exposed but before one
includes the new idea in one's own publication); $1/\gamma$, the duration of the
infectious period, thus how long one publishes on the topic and teaches others;
$\rho$, the probability that an exposed person has multiple effective contacts
with other adopters.

Figure 10: SEIR model of intellectual infection with influxes of susceptibles
and infectives to the corresponding scientific ideas, thus extending the SIR
model by including a class of scientists exposed (E) to the specific scientific
ideas.

This simple model can incorporate a wide range of behaviors. For many
values of the parameters $\Lambda$, $\beta$, $\kappa$, $\gamma$ and $\rho$, the infected class grows as a logistic
curve. For large values of the contact rate $\beta$ or recruitment $\Lambda$, I(t) grows
nearly linearly, as indeed has been found empirically for some research fields.
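To experiment with these regimes, the SEIR system can be integrated numerically. The sketch below assumes the form dS/dt = lam*N - beta*S*I/N, dE/dt = beta*S*I/N - kappa*E - rho*E*I/N, dI/dt = kappa*E + rho*E*I/N - gamma*I, dR/dt = gamma*I, with N = S + E + I + R; the parameter names (lam for the recruitment rate), the parameter values and the Euler integrator are illustrative assumptions.

```python
def seir_rhs(S, E, I, R, lam, beta, kappa, gamma, rho):
    """Right-hand sides of the assumed SEIR system; returns (dS, dE, dI, dR)."""
    N = S + E + I + R
    dS = lam * N - beta * S * I / N
    dE = beta * S * I / N - kappa * E - rho * E * I / N
    dI = kappa * E + rho * E * I / N - gamma * I
    dR = gamma * I
    return dS, dE, dI, dR


def simulate_seir(state, params, t_end, dt=0.01):
    """Forward-Euler integration; returns a list of (S, E, I, R) tuples."""
    history = [tuple(state)]
    for _ in range(int(round(t_end / dt))):
        current = history[-1]
        derivs = seir_rhs(*current, **params)
        history.append(tuple(v + dt * dv for v, dv in zip(current, derivs)))
    return history


if __name__ == "__main__":
    # Hypothetical parameter values chosen only to exercise the code.
    params = dict(lam=0.02, beta=0.5, kappa=0.2, gamma=0.1, rho=0.3)
    run = simulate_seir((90.0, 5.0, 5.0, 0.0), params, t_end=50.0, dt=0.05)
    print("infected at end:", run[-1][2])
```

In this form the four derivatives sum to lam*N, so the total population grows exponentially at the recruitment rate, as stated in the text.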
FOR POLICY-MAKERS
Take away box Nr. 9:

Epidemic models are best suited for describing the expansion stage
of a growth process.



(3) SI discrete model for the change in the number of authors 
in a scientific field (Fig. 11) 

With the goal of predicting the spreading out of scientific objects (such as
theories or methods), Nowakowska [108] discussed several discrete epidemic
models for predicting changes in the number of publications and authors in a
given scientific field. With respect to the publications, the main assumption
of the models is that the number of publications in the next period of time
(say, one year) will depend (i) on the number of papers which recently
appeared, and (ii) on the degree to which the subject has been exhausted.
The numbers of publications appearing in successive periods of time should
first increase, then reach a maximum, and, as the problem becomes
more and more exhausted, decrease.

Figure 11: Schema of a discrete SI evolution model of the number of authors
of scientific papers. The model takes into account that some scientists stop
their work in a scientific field for various reasons, for example
death or loss of interest in particular questions.

Let it be assumed (Fig. 11) that at a certain moment t the epidemic
state is $(x_t, y_t)$, where $x_t$ is the number of infectives (authors who write
papers on the corresponding research problems) and $y_t$ is the number of
susceptibles. Let $a x_t \Delta t$ be the expected number of individuals who either
die or recover during the interval $(t, t + \Delta t)$, and let $b x_t y_t \Delta t$ be the
expected number of new infections, a and b being appropriate constants.
Then, for a sufficiently short time interval $\Delta t$, the equations of this
model are:

$$x_{t+\Delta t} = x_t - a x_t \Delta t + b x_t y_t \Delta t \tag{24}$$
$$y_{t+\Delta t} = y_t - b x_t y_t \Delta t. \tag{25}$$

Note here that such discrete models are useful for the analysis of realistic
situations where the values of the quantities are available at selected moments
(every month, every year, etc.).
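The update rule described above (infectives lose a*x*dt and gain b*x*y*dt, susceptibles lose b*x*y*dt) translates directly into a loop. The following is a minimal sketch; the function name and the numerical values in the usage part are illustrative assumptions.

```python
def si_discrete(x0, y0, a, b, dt, steps):
    """Iterate the discrete SI model for authors in a field.

    x: infectives (active authors), y: susceptibles.
    Returns two lists with the trajectories of x and y.
    """
    xs, ys = [x0], [y0]
    for _ in range(steps):
        x, y = xs[-1], ys[-1]
        new_infections = b * x * y * dt      # expected newly attracted authors
        departures = a * x * dt              # authors who leave the field
        xs.append(x - departures + new_infections)
        ys.append(y - new_infections)
    return xs, ys


if __name__ == "__main__":
    xs, ys = si_discrete(x0=1.0, y0=100.0, a=0.1, b=0.01, dt=1.0, steps=30)
    peak_year = max(range(len(xs)), key=lambda k: xs[k])
    print("authors peak at step", peak_year)
```

Because the pool of susceptibles only shrinks, the number of active authors in such a run first rises and later declines, which is the qualitative pattern the model is meant to capture.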

(4) Daley discrete model for the population of papers (Fig. 12)

Daley [109] investigated the spread of news as follows: individuals who
have not heard the news are susceptible and those who have heard the news are
infective. Recovery is not possible, as it is assumed that the individuals have
perfect memory and never forget. The Daley model can also be applied to
the population of papers [108] (see Fig. 12). For $\Delta t = 1$ (year), the Daley
model equation reads

$$x_{t+1} = b\, x_t \left( N - \sum_{i=1}^{t} x_i \right), \tag{26}$$

where $x_1, x_2, \dots$ are the numbers of papers on the subject which appear in
successive periods of time, b and N being parameters. The expected number
$x_{t+1}$ of papers in year t + 1 is proportional to the number $x_t$ of papers which
appeared in year t, and to the number $N - x_1 - x_2 - \dots - x_t = N - \sum_{i=1}^{t} x_i$
of papers still to appear. N is the number of papers which have to appear in
order to exhaust the problem: the problem under consideration may be
partitioned into N subproblems, such that solving any of them is worth a
separate publication; these subproblems are solved successively by the
scientists. The b and N parameters may be estimated by the method of least
squares, e.g. from a given empirical histogram. A parameter characterizing the
initial growth dynamics in the number of publications can also be introduced:
r = bN. Therefore, Eq.(26) can be used for short-time prediction, even when
the corresponding research field is in the epidemic stage of its evolution.

Figure 12: Daley model for evolution of population of papers on problems in
a scientific field. The exhausting of the scientific field is taken into account.
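The Daley recursion for papers can be iterated directly. The sketch below assumes the form x_{t+1} = b * x_t * (N - sum of all papers published so far), as described in the text; the function name and the example values of b and N are hypothetical.

```python
def daley_papers(x1, b, N, years):
    """Yearly paper counts from the assumed recursion
    x_{t+1} = b * x_t * (N - sum_{i<=t} x_i).

    x1: papers in the first year; N: total number of subproblems;
    returns a list of length `years`.
    """
    xs = [x1]
    for _ in range(years - 1):
        remaining = N - sum(xs)              # subproblems not yet published
        xs.append(b * xs[-1] * remaining)
    return xs


if __name__ == "__main__":
    xs = daley_papers(x1=1.0, b=0.01, N=200.0, years=15)
    print([round(x, 2) for x in xs])
```

With r = bN larger than one, as in the usage example, the yearly output initially grows roughly by the factor r and then slows down as the stock of unsolved subproblems is depleted.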

(5) Discrete model coupling the populations of scientists and 
papers (Fig. 13) 

A discrete model coupling the populations of scientists and papers can be
considered (Fig. 13); it depends on four parameters: N, a, b and c. N, as
above, denotes the number of sub-problems of the given problem; a is the
probability that a scientist working on the subject in a given year abandons
research on the subject for whatever reasons; b is the probability of obtaining
a solution to a given subproblem by one scientist during one year of research;
c denotes the coefficient of attractiveness of the subject. The basic variables
of the model are: $u_t$, the number of scientists working on the subject in year
t, and $x_t$, the number of publications on the subject which appear in year t.
The model equations are

$$u_{t+1} = (1 - a)\, u_t + c\, x_t \tag{27}$$
$$x_{t+1} = \left[ 1 - (1 - b)^{u_t} \right] \left( N - \sum_{i=1}^{t} x_i \right). \tag{28}$$

Figure 13: Discrete model for the joint evolution of populations of scientists
and papers. The attractiveness of the field, the exhaustion of the field, and
the possibility of declining interest in working in the scientific field are taken
into account through adequate rate parameters.

The equation for the number $u_{t+1}$ of scientists working on the subject in year
t + 1 states that the expected number of scientists working on the
subject will be the number of scientists working on the subject in year t, $u_t$,
minus the expected number of scientists who stopped working on the subject,
$a u_t$, plus the expected number of scientists, $c x_t$, who became attracted to
the problem by reading papers which appeared in year t. The equation
expressing the number of publications in year t + 1 tells us that $x_{t+1}$ equals
the number of subproblems that were solved in year t. The probability
that a given subproblem will be solved in year t by a given scientist equals
b. Then the probability of the opposite event, i.e. that a given scientist will not
solve a particular problem, equals 1 - b. As there are $u_t$ scientists working
on the subject in year t, the probability that a given subproblem will not be
solved by any of them is $(1 - b)^{u_t}$. Consequently, the probability that a given
subproblem will be solved in year t (by any of the $u_t$ scientists working on the
subject) is equal to $1 - (1 - b)^{u_t}$. Next, in year t there remain $N - \sum_{i=1}^{t} x_i$
subproblems to be solved. The expected number of subproblems solved in
year t is equal to the product of these two quantities, which gives the
right-hand side of Eq.(28).

N.B. It is assumed that the waiting time for publishing a paper is
one year. A more realistic picture would be to assume that the unit of time
is not one year, but two years, or that the publication has some other time
delay.
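The coupled model of scientists and papers can be iterated year by year. The sketch below assumes the updates u_{t+1} = (1 - a)*u_t + c*x_t and x_{t+1} = [1 - (1 - b)**u_t] * (N - sum of papers so far), following the verbal derivation in the text; the function name and the parameter values in the usage part are hypothetical.

```python
def coupled_scientists_papers(u1, x1, N, a, b, c, years):
    """Iterate the coupled discrete model.

    u: scientists active in a year, x: papers appearing in that year.
    a: yearly dropout probability, b: yearly probability that one scientist
    solves a given subproblem, c: attractiveness coefficient,
    N: total number of subproblems. Returns the lists (us, xs).
    """
    us, xs = [u1], [x1]
    for _ in range(years - 1):
        u, x = us[-1], xs[-1]
        remaining = N - sum(xs)                  # unsolved subproblems
        p_solved = 1.0 - (1.0 - b) ** u          # solved by at least one scientist
        us.append((1.0 - a) * u + c * x)
        xs.append(p_solved * remaining)
    return us, xs


if __name__ == "__main__":
    us, xs = coupled_scientists_papers(u1=10.0, x1=2.0, N=100.0,
                                       a=0.1, b=0.05, c=0.5, years=10)
    for year, (u, x) in enumerate(zip(us, xs), start=1):
        print(year, round(u, 2), round(x, 2))
```

Since the probability factor lies between 0 and 1, the yearly number of papers in this sketch can never exceed the number of subproblems that are still open.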



FOR POLICY-MAKERS 
Take away box Nr. 10: 

In many cases, the data is available as one value per week, or one value
per month, or one value per three months, etc. For modeling and subsequent
short-range forecasting, so-called discrete (time) models are thus very
appropriate.



3.5 Continuous models of the joint evolution of scientific
sub-systems

(1) Coupled continuous model for the populations of scientists and 
papers: Goffman-Newill model 

The Goffman-Newill model [105] (Fig. 14) is based on the idea that the
spreading process within a population can be studied on the basis of the
literature produced by the members of that population. There is a transfer
of infectious materials (ideas) between humans by means of an intermediate
host (a written article). Let a scientific field be F and SF a sub-field of F.
Let the number of scientists writing papers in the field F at $t_0$ be $N_0$ and the
number of scientists writing papers in SF at $t_0$ (the number of infectives) be
$I_0$. Thus, $S_0 = N_0 - I_0$ is the number of susceptibles; there is no removal at
$t_0$, but there is removal R(t) at later times t. The number of papers produced
in F at $t_0$ is $N'_0$ and the number of papers produced in SF at this time is
$I'_0$. The process of intellectual infection is as follows: (a) a member of F
is infected by a paper from I'; (b) after some latency period, this infected
member produces 'infected' papers in N', i.e. the infected member produces
a paper in the subfield SF citing a paper from I'; (c) this 'infected' paper
may infect other scientists from F and its sub-fields, such that the intellectual
infection spreads from SF to the other sub-fields of F.

Figure 14: Schema of the Goffman-Newill model for the evolution of a scientific
field. Scientists are attracted to a sub-field after being intellectually infected
by papers from the sub-field.

Let $\beta$ be the rate at which the susceptibles from class S become
'intellectually infected' from class I'. Let $\beta'$ be the rate at which the papers in
SF are cited by members of N who are producing papers in SF. As the
infection process develops, some susceptibles and infectives are removed, i.e.
some scientists are no longer active, and some papers are not cited anymore.
Let $\gamma$ and $\gamma'$ be the rates of removal of infectives from the populations I and
I' respectively, and $\delta$ and $\delta'$ be the rates of removal from the populations of
susceptibles S and S'. In addition, there can be a supply of infectives and
susceptibles in N and N'. Let the rates of introduction of new susceptibles
be $\mu$ and $\mu'$, i.e. the rates at which new authors and new papers are
introduced in F, and let the rates of introduction of new infectives be $\nu$ and $\nu'$,
i.e. the rates at which new authors and new papers are introduced in SF. In
addition, within a short time interval a susceptible can remain susceptible,
become an infective, or be removed; an infective can remain an infective
or be removed; and a removed individual remains removed. The immunes
remain immune and do not return to the population of susceptibles. If, in
addition, the populations are homogeneously mixed, the system of model
equations reads

$$\frac{dS}{dt} = -\beta S I' - \delta S + \mu; \qquad \frac{dI}{dt} = \beta S I' - \gamma I + \nu \tag{29}$$
$$\frac{dR}{dt} = \gamma I + \delta S; \qquad \frac{dS'}{dt} = -\beta' S' I - \delta' S' + \mu' \tag{30}$$
$$\frac{dI'}{dt} = \beta' S' I - \gamma' I' + \nu'; \qquad \frac{dR'}{dt} = \gamma' I' + \delta' S' \tag{31}$$

The conditions for development of an epidemic are as follows. If, as an initial
condition at $t_0$, a single infective is introduced into the populations $N_0$ and
$N'_0$, then for an epidemic to develop, the change of the number of infectives
must be positive in both populations. Then, for $\rho = (\gamma - \nu)/\beta$ and
$\rho' = (\gamma' - \nu')/\beta'$, the threshold for the epidemic arises from the conditions
$\beta S I' > \gamma I - \nu$ and $\beta' S' I > \gamma' I' - \nu'$ (evaluated with $I = I' = 1$ at $t_0$),
such that the threshold is

$$S_0 S'_0 > \rho \rho'. \tag{32}$$

The development of the epidemic is given by the equation $dI/dt = D(t)$. The peaks
of the epidemic occur at time points where $d^2 I/dt^2 = 0$, while the epidemic's size
is given by $I(t \to \infty)$.

(2) Bruckner-Ebeling-Scharnhorst model for the growth of n 
subfields in a scientific field 
Figure 15: Schema of the Bruckner-Ebeling-Scharnhorst model of evolution of n
scientific sub-fields. Self-reproduction and decline of subfields as well as field
mobility are taken into account.

The evolution of growth processes in a system of scientific fields can be
modeled by complex continuous evolution models. One of them, the
Bruckner-Ebeling-Scharnhorst approach [110] (Fig. 15), is closely related to several
generalizations of Eigen's theory of prebiotic evolution and is briefly discussed
here (see also [111]). In 1912, Lotka [112] published the idea of describing
biological epidemic processes, like malaria, as well as chemical oscillations,
with the help of a set of differential equations. These equations, known as
Lotka-Volterra equations [113, 114], are used to describe a coupled growth
process of populations. However, they do not reflect several essential
properties of evolutionary processes such as the creation of new structural elements.
Because of this, one has to consider a more general set of equations for the
change in the number $x_i$ of the scientists from the i-th scientific subfield (a
Fisher-Eigen-Schuster kind of model), i.e.,

$$\frac{dx_i}{dt} = (A_i - D_i)\, x_i + \sum_{j=1,\, j \neq i}^{n} (A_{ij} x_j - A_{ji} x_i) + \sum_{j=1,\, j \neq i}^{n} B_{ij} x_i x_j - k_0 x_i, \qquad i,j = 1,\dots,n. \tag{33}$$



The model based on Eq.(33) describes the coupled growth of n subfields of a
scientific discipline. Three fundamental processes of evolution are included in
Eq.(33): (a) self-reproduction: students and young scientists join the field
and start working on corresponding problems. Their choice is influenced
mainly by the education process as well as by individual interests and by
existing scientific schools; (b) decline: scientists are active in science for
a limited number of years. For different reasons (for example, retirement)
they stop working and leave the system; (c) field mobility: individuals turn
to other fields of research for various reasons or maybe open up new ones
themselves.



The reasoning to obtain Eq.(33) goes as follows. The general form of the
law for growth of the i-th subfield is supposed to be

$$\frac{dx_i}{dt} = f_i(x), \qquad x = (x_1, \dots, x_n). \tag{34}$$

By separation, $f_i = w_i x_i$, one obtains the replicator equation

$$\frac{dx_i}{dt} = w_i x_i, \qquad i = 1, 2, \dots, n. \tag{35}$$

Notice that when $w_i$ = const, the fields are uncoupled, i.e., there is an
exponential growth in science. Otherwise, $w_i$ itself is a function of x and of
various parameters, but can be separated into three terms according to the
above model assumptions, i.e.,

$$w_i = A_i - D_i + \sum_{j=1,\, j \neq i}^{n} \left( A_{ij} \frac{x_j}{x_i} - A_{ji} \right). \tag{36}$$

Eq.(33) is thus obtained from Eq.(35) and Eq.(36) for $B_{ij} = 0$, $k_0 = 0$.



To adapt this model to real growth processes, it can be assumed that the
coefficients $A_i$, $D_i$, and $A_{ij}$ themselves are functions of $x_i$:

$$A_i = A_i^0 + A_i^1 x_i + \dots; \qquad D_i = D_i^0 + D_i^1 x_i + \dots; \qquad A_{ij} = A_{ij}^0 + A_{ij}^1 x_i + \dots \tag{37}$$

Each of the three fundamental processes of change is represented in Eq.(33)
with a linear and a quadratic term only. For example, the terms $A_i^1$ and
$D_i^1$ account for cooperative effects in self-reproduction and decline processes
respectively, while $D_i^0$ accounts for a decline because of aging. The
contributions $A_{ij}^0$ assume a linear type of field mobility behavior for scientists,
analogous to a diffusion process. On the other hand, the terms $A_{ij}^1$ represent
a directed process of exchange of scientists between fields. The best way to
obtain these parameters is to estimate them for specific data bases using the
method of least squares.



FOR POLICY-MAKERS 
Take away box Nr. 11: 

The Bruckner-Ebeling-Scharnhorst model does not belong to the class of 
epidemic models which are best applicable only for describing the expansion 
stage of a process. 

The Bruckner-Ebeling-Scharnhorst model is an evolution model: it describes 
all stages of the evolution of a system. 



4 Small-size scientific and technological systems.
Stochastic models (Fig. 16)



Figure 16: Hierarchy of stochastic models discussed in this chapter. 
The movement of large bodies in mechanics is governed by deterministic
laws. When the body contains a small number of molecules and atoms,
stochastic effects such as Brownian motion become important. In the
area of scientific systems, fluctuations become very important when the
number of scientists in a certain research subfield is small. This is typical for
new research fields with only a few active scientists.

Several examples of stochastic models for the description of the diffusion
of ideas or technology and the evolution of science are: (a) the model of
evolution of scientific disciplines with an example pertaining to the case of
elementary particle physics [115]; (b) stochastic models for the aging of
scientific literature [116]; (c) stochastic models of the Hirsch index [83] and
of instabilities in evolutionary systems [117]; (d) models of implementation
of technological innovations [118], etc. [119]. In the following (see Fig. 16),
two probabilistic and two stochastic models are discussed. Some attention is
devoted to the master equation approach as well.

4.1 Probabilistic SI and SEI models 

Epidemiological models of differential-equation-based compartmental type
have been found to be limited in their capacity to capture heterogeneities at
the individual level and in the interaction between individual epidemiological
units [120]. This is one of the reasons to switch from models that track the
number of individuals in given known states to models involving
probabilities. One such model [121] captures the diffusion of topics over a network
of connections between scientific disciplines, as assigned by the ISI Web of
Science's classification in terms of Subject Categories (SCs). Each SC is
considered as a node of a network along with all its directed and weighted
connections to other nodes or SCs [121, 122]. As with epidemic models,
nodes can be characterized in a medical way. SCs that are susceptible (S)
are either not aware of a particular research topic or, if aware, may not be
ready to adopt it. Incubating SCs (E) are those that are aware of a certain
topic and have moved to do some research on problems connected with this
topic. Infected SCs (I) are actively working and publishing in a particular
research topic.

Only two probabilistic models are discussed here: (i) the
Susceptible-Exposed-Infected (SEI) model (Fig. 17) and (ii) a simpler
Susceptible-Infected (SI) model (Fig. 18).

(1) Susceptible-Exposed-Infected (SEI) model 
The SEI model equations for the evolution of the node state probabilities
are given by [121]:

$$\frac{dS_i(t)}{dt} = -\sum_j A_{ij}\, I_j(t)\, S_i(t), \tag{38}$$
$$\frac{dE_i(t)}{dt} = \sum_j A_{ij}\, I_j(t)\, S_i(t) - \gamma E_i(t), \tag{39}$$
$$\frac{dI_i(t)}{dt} = \gamma E_i(t), \tag{40}$$

where $0 \le I_i(t) \le 1$ denotes the probability of node i being infected at time
t (likewise for $S_i(t)$ and $E_i(t)$). The directed and weighted contact network
is represented by $A_{ij} = r\, w_{ij}$, with $W = (w_{ij})$ denoting the adjacency
matrix that includes weighted links; r is the transmission rate per contact
and $1/\gamma$ is the average incubation or latent period.

Figure 17: Schema of the probabilistic SEI model for epidemics in a network
connecting scientific disciplines.

This set of equations states that an increase in the probability of a
node i being exposed to an infection is directly proportional to the
probability $S_i$ of node i being susceptible and the probability $I_j$ of neighbouring
nodes j being infected. The number of such contacts and the per-contact
rate of transmission are incorporated in $A_{ij}$. Likewise, $E_i$ decreases when
exposed nodes become infected after an average incubation time $1/\gamma$.
The number of infected SCs at time t, according to the model, can be
estimated as $I(t) = \sum_i I_i(t)$. Since $S_i(t) + E_i(t) + I_i(t) = 1$ for each t > 0,
the right-hand sides of Eqs.(38) - (40) sum to zero, so that Eq.(40) is readily
understood in view of Eqs.(38) and (39).
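A single update of the node probabilities can be written compactly. The sketch below assumes the SEI equations in the form dS_i/dt = -sum_j A_ij I_j S_i, dE_i/dt = sum_j A_ij I_j S_i - gamma*E_i, dI_i/dt = gamma*E_i, with A given as a nested list; the function name, the two-node example network and the Euler step are illustrative assumptions.

```python
def sei_network_step(S, E, I, A, gamma, dt):
    """One forward-Euler step of the probabilistic SEI model on a network.

    S, E, I: lists of node probabilities; A[i][j]: weighted contact rate
    between node j (source of infection) and node i; 1/gamma: mean latent period.
    """
    n = len(S)
    S_new, E_new, I_new = [], [], []
    for i in range(n):
        # Infection pressure on node i from all nodes j.
        force = sum(A[i][j] * I[j] for j in range(n))
        exposure = force * S[i]
        S_new.append(S[i] - dt * exposure)
        E_new.append(E[i] + dt * (exposure - gamma * E[i]))
        I_new.append(I[i] + dt * gamma * E[i])
    return S_new, E_new, I_new


if __name__ == "__main__":
    # Hypothetical two-node network: node 1 starts infected, node 0 susceptible.
    A = [[0.0, 0.2], [0.2, 0.0]]
    S, E, I = [1.0, 0.0], [0.0, 0.0], [0.0, 1.0]
    for _ in range(100):
        S, E, I = sei_network_step(S, E, I, A, gamma=0.5, dt=0.1)
    print("expected number of infected nodes:", sum(I))
```

Each step leaves S_i + E_i + I_i unchanged for every node, which mirrors the normalization of the node probabilities in the model.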
(2) Susceptible-Infected (SI) model 

The above SEI model can be simplified to an SI model when the possibility
of an exposed period is excluded, i.e., if $dE_i/dt = 0$. The equations for this
simpler SI model are reduced to

$$\frac{dS_i(t)}{dt} = -\sum_j A_{ij}\, I_j(t)\, S_i(t); \qquad \frac{dI_i(t)}{dt} = \sum_j A_{ij}\, I_j(t)\, S_i(t), \tag{41}$$

where the change in the probability $I_i$ of a node i being infected and infectious
depends only on the probability $S_i$ of the node i being susceptible and on the
infection probabilities $I_j$ of its neighbours, with no intermediate exposed
state. The comparison of both models with available data shows [121] that
while the agreement at the population level is usually much better for the SEI
model, for the same pair of parameters, the agreement at the individual level
is better when the simpler SI model is used.

Figure 18: Schema of the probabilistic SI model for epidemics in a network
connecting scientific disciplines.


4.2 Master equation approach 

(1) Stochastic evolution model with self-reproduction, decline, and 
field mobility 

There exists a high correlation between field mobility processes and the
emergence of new fields [110]. This can be accounted for by a stochastic model
(see Fig. 19), in which the system at time t is characterized by a set of integers
$N_1, N_2, \dots, N_i, \dots, N_n$, with $N_i$ being, e.g., the number of scientists working
in the subfield i, which is considered now as a stochastic variable. The three
fundamental types of scientific change mentioned in the discussion of the
Bruckner-Ebeling-Scharnhorst model (see above) here correspond to three
elementary stochastic processes with three different transition probabilities:

• (a) For self-reproduction, the transition probability is given by
$W(N_i + 1 \mid N_i) = A_i N_i = A_i^0 N_i + A_i^1 N_i (N_i - 1)$;

• (b) The transition probability for decline is
$W(N_i - 1 \mid N_i) = D_i^0 N_i + D_i^1 N_i (N_i - 1)$;

• (c) The transition probability for field mobility is
$W(N_i + 1, N_j - 1 \mid N_i, N_j) = A_{ij}^0 N_j + A_{ij}^1 N_i N_j$.

The probability density $P(N_1, \dots, N_i, N_j, \dots, t)$ obeys the so-called
master equation

$$\frac{\partial P}{\partial t} = W P, \tag{42}$$

which can be solved analytically only in some very special cases [123].

Figure 19: Schema of the master equation model of evolution of scientific
fields in presence of self-reproduction, decline, and field mobility.



(2) The master equation as a model of scientific productivity 

The productivity factor is a very important ingredient in the mathematical
simulation of the evolution of a scientific community. One way to model such an
evolution is through a dynamic equation which takes into account the stochastic
fluctuations of the productivity of the community members [127] (Fig. 20).
The main processes of scientific community evolution accounted for by this
model are, besides the biological constraints (such as self-reproduction, aging
of scientists, and death), the departure of scientists from the field due to mobility
or abandonment of research activities. Call $a$ the age of an individual and let a
scientific productivity index $\xi$ be incorporated into the individual state
space; both $a$ and $\xi$ are considered to be continuous variables with values
in $[0, \infty)$. The scientific community dynamics is described by a number
density function $n(a, \xi, t)$, another form of scientific landscape, which
specifies the age and productivity structure of the scientific community at time
$t$. For example, the number of individuals with age in $[a_1, a_2]$ and scientific
productivity in $[\xi_1, \xi_2]$ at time $t$ is given by the integral
$\int_{a_1}^{a_2} da \int_{\xi_1}^{\xi_2} d\xi \, n(a, \xi, t)$.

Figure 20: Schema of the master equation model for scientific productivity.

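A master equation for this function $n(a, \xi, t)$ can be derived [127]:

$$\left( \frac{\partial}{\partial a} + \frac{\partial}{\partial t} \right) n(a, \xi, t) = -\left[ J(a, \xi, t) + w(a, \xi, t) \right] n(a, \xi, t) + \int_{-\infty}^{\infty} d\xi' \, \chi(a, \xi - \xi', \xi', t) \, n(a, \xi - \xi', t), \tag{43}$$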

where $w(a, \xi, t)$ denotes the departure rate of community members. If $x(t)$
is a random process describing the scientific productivity variation and if
$p_a(x, t \mid y, \tau)$ (with $\tau < t$) is the transition probability density corresponding
to such a process, then

$$\chi(a, \xi, \xi', t) = \lim_{\Delta t \to 0} \frac{p_a(\xi + \xi', t + \Delta t \mid \xi, t)}{\Delta t}. \tag{44}$$

The transition rate at time $t$ from the productivity level $\xi$, $J(a, \xi, t)$, is by
definition $J(a, \xi, t) = \int_{-\infty}^{\infty} d\xi' \, \chi(a, \xi, \xi', t)$. The increment $\xi'$ may be positive
or negative. The balance equation for $n(a, \xi, t)$ reads as follows

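$$n(a + \Delta a, \xi, t + \Delta t) = n(a, \xi, t) - J(a, \xi, t) \, n(a, \xi, t) \, \Delta t + \int_{-\infty}^{\infty} d\xi' \, \chi(a, \xi - \xi', \xi', t) \, n(a, \xi - \xi', t) \, \Delta t - w(a, \xi, t) \, n(a, \xi, t) \, \Delta t. \tag{45}$$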

The first two terms on the right-hand side, $[1 - J(a, \xi, t) \Delta t] \, n(a, \xi, t)$, describe the
proportion of individuals whose scientific productivity does not change in
$[t, t + \Delta t]$; the integral term describes the individuals whose scientific
productivity becomes equal to $\xi$ because of an increase or decrease in $[t, t + \Delta t]$;
the last term corresponds to the departure of individuals due to stopping
research activities or death. Since age advances at the same rate as time,
$\Delta a = \Delta t$. After expanding $n(a + \Delta t, \xi, t + \Delta t)$ around $a$ and
$t$, keeping terms up to the first order in $\Delta t$, one obtains the master equation
Eq.(43).

As the master equation is difficult to handle for an elaborate analysis,
it is often reduced to an approximate equation similar to the well-known
Fokker-Planck equation [124, 125, 126]. The approximation goes as follows.
Let

$$\mu_k(a, \xi, t) = \int_{-\infty}^{\infty} d\xi' \, (\xi')^k \, \chi(a, \xi, \xi', t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \langle (\xi')^k \rangle; \quad k = 1, 2, \ldots, \tag{46}$$

where the brackets denote the average with respect to the conditional
probability density $p_a(\xi + \xi', t + \Delta t \mid \xi, t)$. In addition, the following assumptions
are made: (i) $\mu_1, \mu_2 < \infty$; $\mu_k = 0$ for $k \ge 3$; (ii) $n(a, \xi, t)$ and $\chi(a, \xi, \xi', t)$
are analytic in $\xi$ for all $a$, $t$ and $\xi'$. The additional assumption $\mu_k = 0$
for $k \ge 3$ demands the productivity to be continuous in the sense that as
$\Delta t \to 0$, the probability of large fluctuations $|\xi'|$ must decrease so quickly
that $\langle |\xi'|^3 \rangle \to 0$ more quickly than $\Delta t$.

When the above assumptions hold, the function $n$ satisfies the equation

$$\left( \frac{\partial}{\partial a} + \frac{\partial}{\partial t} \right) n = -w n - \frac{\partial}{\partial \xi} (\mu_1 n) + \frac{1}{2} \frac{\partial^2}{\partial \xi^2} (\mu_2 n). \tag{47}$$

If $w = 0$, Eq.(47) reduces to the well-known Fokker-Planck equation.
Eq.(47) describes the scientific community evolution through a drift along
the age component and a drift and diffusion with respect to the productivity
component. The diffusion term, characterized by the diffusivity $\mu_2$, takes into
account the stochastic fluctuations of scientific productivity conditioned by
internal factors (such as individual abilities, labor motivations, etc.) and
external factors (such as labor organization, stimulation system, etc.). The
initial and boundary conditions for Eq.(47) are: (a) $n(a, \xi, 0) = n^0(a, \xi)$,
where $n^0(a, \xi)$ is a known function defining the community age and productivity
distribution at time $t = 0$; and (b) $n(0, \xi, t) = \nu(\xi, t)$, where the
function $\nu(\xi, t)$ represents the intensity of the input flow of new members at age
$a = 0$, with $\nu(\xi, 0) = n^0(0, \xi)$. In addition, $n(a, \xi, t) \to 0$ as $a \to \infty$.



Finding the general solution of Eq.(47) with the above initial condition
(a) and boundary condition (b) remains a difficult task. However, for many
practical applications, knowledge of the first and second moments of the
distribution function $n(a, \xi, t)$ is sufficient. Eq.(47) can be solved numerically or can
be reduced to a system of ordinary differential equations [127].



FOR POLICY-MAKERS 
Take away box Nr. 12: 

In deterministic cases, the system is robust against fluctuations: it follows 
some trajectory and the fluctuations are too weak to change it. When the 
fluctuations are important, then different trajectories for the evolution of the 
system become possible. To each trajectory, a probability can be assigned. 
This probability reflects the chance that the system will follow the corre- 
sponding trajectory. The collection of the probabilities leads to a probability 
distribution which can be calculated, in many evolutionary cases, on the basis 
of the master equation approach. 



Finally, two additional problems that can be treated by the master equa- 
tion approach can be mentioned: 

• Age-dependent models where the birth and death rates connected to
the selection are age-dependent [128, 129].

• The problem of new species in evolving networks. On the basis
of a stochastic treatment of the problem, the notion of 'innovation' can
be introduced in a broad sense as a disturbance and/or an instability
of a corresponding social, technological, or scientific system. The fate 
of a small number of individuals of a new species in a biological system 
can be thought to be mathematically equivalent to some extent to the 
fate of a new idea, a new technology, or a new model of behavior. 
The evolution of the new species can be studied on evolving networks, 
where some nodes can disappear and new nodes can be introduced. 
This evolution of the network can change significantly the dynamic 
behavior of the entire system of interacting species itself. Some of 
the species can vanish in a finite time. This feature can be captured 
effectively by the master equation approach. 

5 Space-time models. Competition of ideas. 
Ideological struggle 

A further level of complication is to include spatial variables explicitly in the 
above models describing the diffusion of ideas. At this stage of globalization 
of economies, with several of its concomitant features, like idea, knowledge, 
and technology diffusion, to consider the spatial aspect is clearly a must. A 
large amount of research on the spatial aspects of diffusion of populations is 
already available. As examples of early work, papers by Kerner [130], Allen
[131], Okubo [132], and Willson and de Roos [133] can be pointed out. From
the point of view of diffusion of ideas and scientists, the previously discussed
continuous model of research mobility [110] has to be singled out. Moreover,
the model presented below is closely connected to the space-time models of
migration of populations developed by Vitanov and co-authors [134, 135].
In addition, a reproduction-transport equation model (see Fig. 21) can be 
discussed. 

5.1 Model of competition between ideologies 

The diffusion of ideas is necessarily accompanied by competition processes.
One model of competition between systems of ideas (ideologies) goes as follows
(Fig. 22). Let a population of $N$ individuals occupy a two-dimensional
plane. Suppose that there exists a set of ideas or ideologies $P = \{P_0, P_1, \ldots, P_n\}$
and let $N_i$ members of the population be followers of the ideology $P_i$. The
$N_0$ members of the class $P_0$ are not supporters of any ideology; in some
sense, they have their own individual one and do not wish to be considered
associated with another one, global or not. In such a way, the population
is divided into $n + 1$ sub-populations of followers of different ideologies. The
total population is $N = N_0 + N_1 + \ldots + N_n$. Let a small region $\Delta S = \Delta x \, \Delta y$
be selected in the plane. In this region there are $\Delta N_i$ individuals holding the
$i$-th ideology, $i = 0, 1, \ldots, n$. If $\Delta S$ is sufficiently small, the density of the
$i$-th population can be defined as $\rho_i(x, y, t) = \Delta N_i / \Delta S$.

Figure 21: Relations between space-time models discussed in this chapter.

Allow the members of the $i$-th population to move through the borders of
the area $\Delta S$. Let $\vec{j}_i(x, y, t)$ be the current of this movement. Then $(\vec{j}_i \cdot \vec{n}) \, \delta l$ is
the net number of members of the $i$-th population/ideology crossing a small
line $\delta l$ with normal vector $\vec{n}$. Let the changes in the number of members that are
not caused by movement be summarized by the function $C_i(x, y, t)$. The total
change of the number of members of the $i$-th population is

$$\frac{\partial \rho_i}{\partial t} + \mathrm{div} \, \vec{j}_i = C_i. \tag{48}$$

Figure 22: Schema of the space-time model for describing competition
between ideologies.

The first term in Eq.(48) describes the net rate of increase of the density of
the $i$-th population. The second term accounts for the net rate of migration
across the borders of the area. The r.h.s. of Eq.(48) describes the net rate of
increase exclusive of migration.

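Let us now specify $\vec{j}_i$ and $C_i$. The current $\vec{j}_i$ is assumed to be made of a non-diffusion
part $\vec{j}_i^{(1)}$ and a diffusion part $\vec{j}_i^{(2)}$, where $\vec{j}_i^{(2)}$ is assumed to have the general
form of a linear multicomponent diffusion [130] in terms of diffusion
coefficients $D_{ik}$:

$$\vec{j}_i = \vec{j}_i^{(1)} + \vec{j}_i^{(2)} = \vec{j}_i^{(1)} - \sum_{k=0}^{n} D_{ik}(\rho_i, \rho_k, x, y, t) \, \nabla \rho_k. \tag{49}$$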

Let some of the followers of the ideology $P_i$ be capable of and interested
in changing ideology, i.e., they can convert from the ideology $P_i$ to the
ideology $P_j$. It can be assumed that the following processes can happen
with respect to the members of the subpopulations:

(a) deaths: described by a term $r_i \rho_i$. It is assumed that the number of
deaths in the $i$-th population is proportional to its population density. In
general $r_i = r_i(\rho_\nu, x, y, t; p_\mu)$, where $\rho_\nu$ stands for $(\rho_0, \rho_1, \ldots, \rho_n)$ and $p_\mu$
stands for $(p_1, \ldots, p_m)$, a set of parameters of the environment;

(b) non-contact conversion: this class includes all conversions exclusive of
the conversions by interpersonal contact between the members of whatever
populations. A reason for non-contact conversion can be the existence of
different kinds of mass communication media which make propaganda for
whatever ideologies. As a result, members of each population can change
ideology. For the $i$-th population, the change in the number of members is
$\sum_{j=0}^{n} f_{ij} \rho_j$, with $f_{ii} = 0$. In general, $f_{ij} = f_{ij}(\rho_\nu, x, y, t; p_\mu)$;

(c) contact conversion: it is assumed that there can be interpersonal contacts
among the population members. The contacts happen between members in groups
consisting of two members (binary contacts), three members (ternary contacts),
four members, etc. As a result of the contacts, members of each population can
change their ideology. For binary contacts, let it be assumed that the probability
of a change of ideology for a member of the $j$-th population is proportional to
the possible number of contacts, i.e., to the density of the $i$-th population.
Then the total number of "conversions" from $P_j$ to $P_i$ is $a_{ij} \rho_i \rho_j$, where $a_{ij}$ is
a parameter. In order to have a ternary contact, one must have a group of
three members. The simplest assumption is that such a group exists with
a probability proportional to the corresponding densities of the populations
concerned. In a ternary contact between members of the $i$-th, $j$-th, and
$k$-th populations, members of the $j$-th and $k$-th populations can change their
ideology to $P_i$, which contributes a term $b_{ijk} \rho_i \rho_j \rho_k$, where $b_{ijk}$ is a parameter.
In general, $a_{ij} = a_{ij}(\rho_\nu, x, y, t; p_\mu)$; $b_{ijk} = b_{ijk}(\rho_\nu, x, y, t; p_\mu)$; etc.

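On the basis of the above, the $C_i$ term looks as follows (for more research
on these types of population models see [136, 137, 138]):

$$C_i = r_i \rho_i + \sum_{j=0}^{n} f_{ij} \rho_j + \sum_{j=0}^{n} a_{ij} \rho_i \rho_j + \sum_{j,k=0}^{n} b_{ijk} \rho_i \rho_j \rho_k + \ldots, \tag{50}$$

and the model system of equations becomes

$$\frac{\partial \rho_i}{\partial t} + \mathrm{div} \, \vec{j}_i^{(1)} - \sum_{j=0}^{n} \mathrm{div} \left( D_{ij} \nabla \rho_j \right) = r_i \rho_i + \sum_{j=0}^{n} f_{ij} \rho_j + \sum_{j=0}^{n} a_{ij} \rho_i \rho_j + \sum_{j,k=0}^{n} b_{ijk} \rho_i \rho_j \rho_k + \ldots \tag{51}$$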
The density of the entire population is $\rho = \sum_{i=0}^{n} \rho_i$. It can be assumed
that it changes in time according to the Verhulst law (but see the note after
Eq.(56)):

$$\frac{\partial \rho}{\partial t} = r \rho \left( 1 - \frac{\rho}{C} \right), \tag{52}$$

where $C(\rho_\nu, x, y, t; p_\mu)$ is the so-called carrying capacity of the environment
[96] and $r(\rho_\nu, x, y, t; p_\mu)$ is a positive or negative growth rate. When pertinent
sociological data are available, the same type of equation could hold for any
$i$-th population with a given $r_i$.

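First, consider the case in which the current $\vec{j}_i^{(1)}$ is negligible, i.e., $\vec{j}_i^{(1)} \approx 0$
(the non-diffusion current is neglected). In addition, consider only the case when all
parameters are constants. The model system of equations becomes

$$\frac{\partial \rho_i}{\partial t} - \sum_{j=0}^{n} D_{ij} \Delta \rho_j = r_i \rho_i + \sum_{j=0}^{n} f_{ij} \rho_j + \sum_{j=0}^{n} a_{ij} \rho_i \rho_j + \sum_{j,k=0}^{n} b_{ijk} \rho_i \rho_j \rho_k + \ldots, \tag{53}$$

where

$$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}, \quad i = 0, 1, 2, \ldots, n. \tag{54}$$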

Let plane-averaged quantities and fluctuations (linear or nonlinear) be
the relevant quantities. Let $q(x, y, t)$ be a quantity defined in an area $S$. By
definition, a plane-averaged quantity is $\bar{q}(t) = \frac{1}{S} \iint_S dx \, dy \, q(x, y, t)$. Call the
fluctuations $Q(x, y, t)$, such that $q(x, y, t) = \bar{q}(t) + Q(x, y, t)$. If the territory
is large and within the stationary approximation, $S$ can be assumed to be
large enough that each plane-averaged combination of fluctuations vanishes,
i.e., $\overline{Q_i} = \overline{Q_i Q_j} = \overline{Q_i Q_j Q_k} = \ldots = 0$. In addition, with $S$ being
large and $\iint_S dx \, dy \, \Delta Q_k$ assumed to be finite, it can also be assumed that
$\overline{\Delta Q_k} = \frac{1}{S} \iint_S dx \, dy \, \Delta Q_k \to 0$.

On the basis of the above (reasonable) assumptions, it is possible to
separate the dynamics of the averaged quantities from the dynamics of
fluctuations. As a result of the plane-average of Eq.(53), the following equations
for the dynamics of the plane-averaged densities are obtained

$$\bar{\rho}_0 = \bar{\rho} - \sum_{i=1}^{n} \bar{\rho}_i, \quad \frac{d\bar{\rho}}{dt} = r \bar{\rho} \left( 1 - \frac{\bar{\rho}}{C} \right), \tag{55}$$

$$\frac{d\bar{\rho}_i}{dt} = r_i \bar{\rho}_i + \sum_{j=0}^{n} f_{ij} \bar{\rho}_j + \sum_{j=0}^{n} a_{ij} \bar{\rho}_i \bar{\rho}_j + \sum_{j,k=0}^{n} b_{ijk} \bar{\rho}_i \bar{\rho}_j \bar{\rho}_k + \ldots \tag{56}$$

Instead of Eq.(55), one can write an equation for $\bar{\rho}_0$ of the same kind as Eq.(56). Then
the total population density $\bar{\rho}$ will not follow the Verhulst law.



Equations (55) and (56) represent the model of ideological struggle proposed
by Vitanov, Dimitrova and Ausloos [139]. There is one important
difference between the Lotka-Volterra models [112, 114], often used for describing
prey-predator systems, and the above model of ideological struggle.
The originality resides in the generalization of usual prey-predator models 
to the case in which a prey (or predator) changes its state and becomes a 
member of the predator pack (or prey band), due to some interaction with its 
environment or with some other prey or predator. Indeed, it can be hard for 
rabbits and foxes to do so, but it can be often the case in a society: a member 
of one population can drop his/her ideology and can convert to another one. 

In order to show the relevance of such extra conditions for the evolution
of populations, consider a huge (mathematical) approximation; it might
be a drastic one, in particular in a country with a strictly growing total
population. (Recall that the growth rate $r$ could be positive, negative, or
time-dependent.) Let $r > 0$ and let the maximum possible population of
the country be $C$. Consider more convenient notations by setting $\bar{\rho} = N$;
$\bar{\rho}_0 = N_0$; $\bar{\rho}_i = N_i$, and assume that the binary contact conversion is much
stronger than the ternary, etc. conversions. Writing $b_{ij}$ for the binary contact
coefficients, the system equations become

$$N_0 = N - \sum_{i=1}^{n} N_i; \quad \frac{dN}{dt} = r N \left( 1 - \frac{N}{C} \right), \tag{57}$$

$$\frac{dN_i}{dt} = r_i N_i + \sum_{j=0}^{n} f_{ij} N_j + \sum_{j=0}^{n} b_{ij} N_i N_j. \tag{58}$$



Reduce the discussion of Eqs.(57) and (58) to a society in which there is
the spreading of only one ideology; therefore, the population of the country
is divided into two groups: $N_1$, followers of the "invading" ideology, and $N_0$,
people who are at first "indifferent" to this ideology. Let only the non-contact
conversion scheme exist, as possibly moving the ideology-free population toward
the single ideology; thus $f_{10}$ is finite, but $b_{10} = 0$. Let the initial
conditions be $N(t = 0) = N(0)$ and $N_1(t = 0) = N_1(0)$. The solution of the
system of model equations is

$$N(t) = \frac{C N(0)}{N(0) + (C - N(0)) e^{-rt}}, \tag{59}$$

like the Verhulst law, but

$$N_1(t) = e^{-(f_{10} - r_1) t} \left\{ N_1(0) + \frac{C f_{10}}{r} \left[ \Phi\left( -\frac{C - N(0)}{N(0)}, 1, -\frac{f_{10} - r_1}{r} \right) - e^{(f_{10} - r_1) t} \, \Phi\left( -\frac{C - N(0)}{N(0) e^{rt}}, 1, -\frac{f_{10} - r_1}{r} \right) \right] \right\}, \tag{60}$$

with

$$N_0(t) = N(t) - N_1(t), \tag{61}$$

in which $\Phi$ is the special function $\Phi(z, a, v) = \sum_{n=0}^{\infty} \frac{z^n}{(v + n)^a}$; $|z| < 1$.

The obtained solution describes an evolution in which the total population
$N$ asymptotically reaches the carrying capacity $C$ of the environment.
The number of adepts of the ideology reaches an equilibrium value which
corresponds to the fixed point $N_1^* = C f_{10} / (f_{10} - r_1)$ of the model equation
for $N_1$. The number of people who are not followers of the ideology asymptotically
tends to $N_0^* = C - N_1^*$. Let $C = 1$, $f_{10} = 0.03$, and $r_1 = -0.02$; then
$N_1^* = 0.6$, which means that the evolution of the system leads to an asymptotic
state in which 60% of the population are followers of the ideology and
40% are not.
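The asymptotic state quoted above can be checked without the closed-form solution by integrating the two model equations numerically. The sketch below assumes the single-ideology reduction of Eqs.(57) and (58), $dN/dt = rN(1 - N/C)$ and $dN_1/dt = r_1 N_1 + f_{10}(N - N_1)$, and uses a classical fourth-order Runge-Kutta step. The growth rate `r = 0.05`, the initial values, the step size, and the function name `simulate_one_ideology` are assumptions made only for this illustration; the text fixes only $C$, $f_{10}$ and $r_1$.

```python
def simulate_one_ideology(C=1.0, r=0.05, r1=-0.02, f10=0.03,
                          N_start=0.2, N1_start=0.0, t_max=1000.0, dt=0.1):
    """Integrate the one-ideology model with a fixed-step RK4 scheme."""

    def rhs(N, N1):
        dN = r * N * (1.0 - N / C)       # Verhulst law for the total population
        dN1 = r1 * N1 + f10 * (N - N1)   # non-contact conversion only, N_0 = N - N_1
        return dN, dN1

    N, N1 = N_start, N1_start
    for _ in range(int(round(t_max / dt))):
        k1 = rhs(N, N1)
        k2 = rhs(N + 0.5 * dt * k1[0], N1 + 0.5 * dt * k1[1])
        k3 = rhs(N + 0.5 * dt * k2[0], N1 + 0.5 * dt * k2[1])
        k4 = rhs(N + dt * k3[0], N1 + dt * k3[1])
        N += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        N1 += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return N, N1


N_final, N1_final = simulate_one_ideology()
N0_final = N_final - N1_final  # people who do not follow the ideology
```

For long integration times the result should approach $N = C$, $N_1 = C f_{10}/(f_{10} - r_1)$ and $N_0 = C - N_1$, independently of the assumed initial values.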

Other more complex cases with several competing ideologies can be discussed,
observing steady states and/or cycles (with different values of the
time intervals for each growth and/or decay), chaotic behaviors, etc. [139].
In particular, it can be shown that accepting a slight change in the conditions
of the environment can prevent the extinction of some ideology. After
almost collapsing, an ideology can spread again and can affect a significant
part of the country's population. Two kinds of such resurrection effects have
been found and described as phoenix effects in the case of two competing
ideologies. In the phoenix effect of the first kind, the equilibrium state connected
to the extinction of the second ideology exists but is unstable. In the
phoenix effect of the second kind, the equilibrium state connected
to the extinction of the second ideology vanishes. In short, the above model seems
powerful enough to discuss many realistic cases. The number of control parameters
seems huge, but that is the case for many competing epidemics in
complex systems. However, it was observed that the values of parameters
can be monitored when enough data is available, including the time scales



FOR POLICY-MAKERS 
Take away box Nr. 13: 

Space-time models are very appropriate for modeling migration processes 
such as the spatial migration of scientists, besides the diffusion of ideas 
through competition without strictly physical motion. 



5.2 Continuous model of evolution of scientific sub- 
fields. Reproduction-transport equation 

The change of subject of a scientist can be considered as a migration process
[110, 140]. Let research problems be represented by sequences of signal words
or macro-terms $P_i = (m_i^1, m_i^2, \ldots, m_i^k, \ldots, m_i^n)$ which are registered according
to the frequency of their appearance, joint appearance, etc., respectively,
in the texts. Each point of the problem space, described by a vector $q$,
corresponds to a research problem, with the problem space consisting of all
scientific problems (no matter whether they are under investigation or not).
The scientists distribute themselves over the space of scientific problems with
density $x(q, t)$. Thus, there is a number $x(q, t) \, dq$ of scientists working at time $t$ in the
element $dq$. The field mobility processes correspond to a density change of
scientists in the problem space: instead of working on problem $q$, a scientist
may begin to work on problem $q'$. As a result, $x(q, t)$ decreases and $x(q', t)$
increases. This movement of scientists (see also Fig. 23) can be described by
means of the following reproduction-transport equation:

$$\frac{\partial x(q, t)}{\partial t} = w(q \mid x) \, x(q, t) + \frac{\partial}{\partial q} \left( f(q, x) + D(q) \frac{\partial}{\partial q} x(q, t) \right). \tag{62}$$

In Eq.(62), self-reproduction and decline are represented by the term
$w(q \mid x) \, x(q, t)$. For the reproduction rate function $w(q \mid x)$, one can write

$$w(q \mid x) = a(q) + \int dq' \, b(q, q') \, x(q', t). \tag{63}$$
Figure 23: Schema of the reproduction-transport equation model of joint
evolution of scientific fields.

The local value of $a(q)$ is an expression of the rate at which the number
of scientists in field $q$ is modified through self-reproduction and/or decline,
while $b(q, q')$ describes the influence exerted on the field $q$ by the
neighboring field $q'$. The field mobility is modeled by means of the term
$\frac{\partial}{\partial q} \left( f(q, x) + D(q) \frac{\partial}{\partial q} x(q, t) \right)$. In most cases, Eq.(62) can only be solved
numerically. For more details on the model, see [110].

Figure 24: Statistical laws and their relationships as discussed in the chapter.



6 Statistical approaches to the diffusion of 
knowledge 

Solomon and Richmond [76, 111] have shown that the systems of generalized
Lotka-Volterra equations are closely connected to the Pareto-Zipf probability
distribution. Since such a distribution arises among other distributions and
laws connected to the description of the diffusion of knowledge, it is of interest
to discuss briefly the diffusion of knowledge within statistical approach
studies. Lotka was its pioneer; a large amount of research has followed. Just
as examples, one can mention the work of Yablonsky and Haitun on the
Lotka law for the distribution of scientific productivity and its connection
with the Yule distribution [141, 142, 143], where the non-Gaussian nature of
scientific activities is emphasized. Interesting applications of the Zipf law
are also presented in [144]. In connection with non-Gaussian distributions, the
concepts of self-similarity and fractality have been applied to the scientific
system in [145] and [146]. Several tools for appropriate statistical analysis
are discussed below. At the center of the discussion Lotka law shall receive



As part of this discussion on the statistical approach, the analysis of the
productivity of scientists can be considered. The information connected to
new ideas is thought to be often codified in scientific papers. Thus, the
statistical aspects of scientific productivity are of practical importance. For
example, the Lotka law reflects the distribution of publications over the set
of authors considered as the information sources. Bradford law describes
the distribution of papers on a given topic over the set of journals publishing
these papers, ranked in decreasing order of the number
of papers on the given topic in each journal. These laws have a non-Gaussian
nature and, because of this, possess specific features such as a concentration
and dispersal effect [141]: for example, it is found that there is a small number
of highly productive scientists who write most of the papers on a given topic
and, on the other hand, a large number of scientists with low productivity.

In order to give an example of the connection between the deterministic
and statistical approaches, recall that the Goffman-Newill model, discussed
above, presents a connection between the number of scientists
working in a research area and the number of relevant publications. In [106],
it was found that the number of new publications scales as a simple power
law with the corresponding number of new authors: $\Delta P = C (\Delta T)^\alpha$, where
$\Delta P$ and $\Delta T$ are the new publications and the new authors over some time
period (for example, one year), $C$ is a normalization constant, and $\alpha$ is a
scaling exponent. It has been demonstrated [106] that this relationship
provides a very good fit to data for six different research fields, but with
different values of the scaling exponent $\alpha$. For $\alpha > 1$, a field grows while
showing an increase in the number of publications per capita, i.e., in such
a research field, the individual productivity increases as the field attracts
new scientists. A field with $\alpha < 1$ has a per capita decrease in productivity.
This can be a warning signal for a dying subject matter. It would be
interesting to observe whether the exponent $\alpha$ is time-dependent, as is the
case for related scaling exponents of financial markets [148] or
in meteorology [149]. Policy control can thus be implemented for influencing $\alpha$,
and thus the field mobility.

2 Let us mention a curious and interesting fact connected to statistical indicators:
the conclusion in [147] that the scale-independent indicators show that in the
fast growing innovation system of China, research institutions financed by the government
play a more important role than the enterprises.
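The scaling exponent in $\Delta P = C (\Delta T)^\alpha$ is commonly estimated by a straight-line fit in log-log coordinates, since $\ln \Delta P = \ln C + \alpha \ln \Delta T$. The sketch below assumes ordinary least squares on the logarithms; this is an illustrative choice and not necessarily the fitting procedure used in [106]. The function name `fit_power_law` and the yearly counts are hypothetical, synthetic values generated from a known power law, not data from any research field.

```python
import math


def fit_power_law(new_authors, new_publications):
    """Least-squares fit of ln(dP) = ln(C) + alpha * ln(dT); returns (C, alpha)."""
    xs = [math.log(t) for t in new_authors]
    ys = [math.log(p) for p in new_publications]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    C = math.exp(mean_y - alpha * mean_x)
    return C, alpha


# Synthetic yearly counts generated with C = 2 and alpha = 1.2 (illustration only).
authors_per_year = [10, 20, 40, 80, 160]
publications_per_year = [2.0 * t ** 1.2 for t in authors_per_year]
C_fit, alpha_fit = fit_power_law(authors_per_year, publications_per_year)
```

A fitted value of $\alpha$ above 1 would indicate increasing per capita productivity, a value below 1 the opposite; repeating the fit over sliding time windows is one way to look for a time dependence of $\alpha$.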



FOR POLICY-MAKERS 
Take away box Nr. 14: 

There exist two different kinds of statistical approaches for the analysis of 
scientific productivity: (i) the frequency approach and (ii) the rank approach. 
The frequency approach is based on the direct statistical counting of the 
number of corresponding information sources, such as scientists or journals. 
The rank approach is based on a ranking of the sources with respect to their 
productivity. The frequency and the rank approaches represent different and 
complementary reflections of the same law and form. 



6.1 Lotka law. Distributions of Pareto and Yule. 

Pareto [150] formulated the 80/20 rule: it can be expected that 20% of people
will have 80% of the wealth. Similarly, it can be expected that 80% of the citations
refer to a core of 20% of the titles in journals. The idea of the rule of Pareto
is very close to the research of Lotka, who noticed the following dependence
for the number of scientists who wrote $k$ papers

$$n_k = \frac{n_1}{k^2}; \quad k = 1, 2, \ldots, k_{\max}. \tag{64}$$

In Eq.(64), $n_1$ is the number of scientists who wrote just one paper and $k_{\max}$
is the maximal productivity of a scientist. Summing over all productivity levels gives

$$\sum_{k=1}^{k_{\max}} n_k = n_1 \sum_{k=1}^{k_{\max}} \frac{1}{k^2} = N, \tag{65}$$

where $N$ is the total number of scientists. If we assume that $k_{\max} \to \infty$
and take into account the fact that $\sum_{k=1}^{\infty} 1/k^2 = \pi^2/6$, we obtain a limiting
value for the portion of scientists with the minimal productivity (single paper
authors) in the given population of authors: $p_1 = n_1/N = 6/\pi^2 \approx 0.6$. Then, if the
left and the right hand sides of Eq.(64) are divided by $N$, the frequency
expression for the productivity distribution is $p_k = 0.6/k^2$; $\sum_{k=1}^{\infty} p_k = 1$.

Eq.(64) is called Lotka law, or the law of inverse squares: the number of
scientists who wrote a given number of papers is inversely proportional to
the square of this number of papers.
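The limiting share of single paper authors can be checked numerically. The sketch below assumes the inverse-square form of Eq.(64) and normalizes it over a finite $k_{\max}$; the function name `lotka_frequencies` and the value of `k_max` are hypothetical choices for illustration.

```python
import math


def lotka_frequencies(k_max):
    """Return [p_1, ..., p_kmax] for the inverse-square law n_k = n_1 / k**2."""
    weights = [1.0 / k ** 2 for k in range(1, k_max + 1)]
    norm = sum(weights)  # equals N / n_1 in Eq.(65)
    return [w / norm for w in weights]


p = lotka_frequencies(100000)
single_paper_share = p[0]       # p_1, close to 6 / pi**2 for large k_max
limit = 6.0 / math.pi ** 2      # limiting value for k_max -> infinity
```

For smaller `k_max` the share of single paper authors is somewhat larger than the limiting value, since fewer productivity levels share the normalization.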
It must be noted that, like many other statistical regularities, Lotka law
is valid only on the average since the exponent in the denominator of Eq.(64)
is not necessarily equal to two [141]. Thus, Lotka law should be considered
as the most typical among a more general family of distributions:

$$n_k = \frac{n_1}{k^{1+\alpha}}; \quad p_k = \frac{p_1}{k^{1+\alpha}}, \tag{66}$$

where $\alpha$ is the characteristic exponent of the distribution and $p_1$ is the normalizing
coefficient, which is determined as follows:

$$p_1 = \left( \sum_{k=1}^{k_{\max}} \frac{1}{k^{1+\alpha}} \right)^{-1}. \tag{67}$$

Then the distribution of scientific output, Eq.(66), is determined by three
parameters: the proportion of scientists with the minimal productivity $p_1$,
the maximal productivity of a scientist $k_{\max}$, and the characteristic exponent
$\alpha$. If one of these parameters is fixed, it is possible to study the dependence
between the two others. Let us fix $k_{\max}$ in Eq.(67). Then, we obtain the proportion
of "single paper authors" $p_1$ as a function of $\alpha$: $p_1(\alpha)$. When Eq.(67) is
differentiated with respect to $\alpha$, one can show that the corresponding derivative
is positive for any $\alpha$: $dp_1(\alpha)/d\alpha > 0$. On the basis of a similar analysis
of the portion of scientists with a larger productivity $p_k(\alpha)$ as a function of
$\alpha$, we arrive at the conclusion: the increase of $\alpha$ is accompanied by the
increase of low-productivity scientists. This means that when the total
number of scientists is preserved the portion of highly productive scientists
will decrease.
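The monotonic dependence of $p_1$ on $\alpha$ can be illustrated by evaluating Eq.(67) directly. The sketch below assumes the generalized law with exponent $1 + \alpha$ and a fixed $k_{\max}$; the function names, the grid of exponents, the value `k_max = 1000`, and the threshold of 10 papers used to define "highly productive" are all hypothetical choices for illustration.

```python
def single_paper_share(alpha, k_max):
    """p_1 from Eq.(67): inverse of the sum of k**-(1 + alpha) for k = 1..k_max."""
    return 1.0 / sum(k ** -(1.0 + alpha) for k in range(1, k_max + 1))


def share_with_at_least(k_min, alpha, k_max):
    """Fraction of scientists with k_min or more papers under Eq.(66)."""
    p1 = single_paper_share(alpha, k_max)
    return p1 * sum(k ** -(1.0 + alpha) for k in range(k_min, k_max + 1))


alphas = [0.5, 1.0, 1.5, 2.0]
low_productivity = [single_paper_share(a, 1000) for a in alphas]
high_productivity = [share_with_at_least(10, a, 1000) for a in alphas]
```

On this grid the share of single paper authors grows with $\alpha$, while the share of authors with ten or more papers shrinks, in line with the conclusion stated above.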

Let us show that the Lotka law is an asymptotic expression for the Yule
distribution. In order to obtain the Yule distribution, one considers the process
of formation of a collection of publications as a Markov-type stochastic
process. In addition, it is assumed that the probability of writing a new
paper depends on the number of papers that have already been written by
the scientist at time $t$: the probability of the transition into a new state on
the interval $[t, t + \Delta t]$ should be a function of the state in which the system is
at time $t$. Moreover, the probability of publishing a new paper during a time
interval $\Delta t$, $p(x \to x + 1, \Delta t)$, is assumed to be proportional to the number $x$
of papers that have been written by the scientist, introducing an intensity
coefficient $\lambda$: $p(x \to x + 1, \Delta t) \propto \lambda x \Delta t$. After solving the corresponding
system of differential equations for this process, the following expression (the
Yule distribution) for the probability $p(x/t)$ of a scientist writing $x$ papers
during a time $t$ is obtained [141]:

$$p(x/t) = \exp(-\lambda t) \left( 1 - \exp(-\lambda t) \right)^{x-1}, \quad x = 1, 2, \ldots \tag{68}$$

The mean value of the Yule distribution is $\bar{x}_t = \exp(\lambda t)$. Let us take into
account the fact that every scientist works on a given subject during a certain
finite random time interval $[0, t]$ which depends on the scientist's creative
potential, the conditions for work, etc. With the simplest assumption that
the probability of discontinuing work on a given subject is constant at any
time, one obtains an exponential distribution for the time of work of any
author in the scientific field under study: $p(t) = \mu \exp(-\mu t)$, where $\mu$ is
the distribution parameter. The time parameter $t$ which characterizes the
productivity distribution, Eq.(68), is a random number. Then, in order to
obtain the final distribution of scientific output observed in the experiment
over sufficiently large time intervals, Eq.(68) should be averaged with respect
to this parameter $t$, which is distributed according to the exponential law:

$$p(x) = \int_0^\infty dt \, p(x/t) \, p(t) = \int_0^\infty dt \, \exp(-\lambda t) \left( 1 - \exp(-\lambda t) \right)^{x-1} \mu \exp(-\mu t). \tag{69}$$

After integrating Eq.(69), the distribution of scientific output reads

$$p(x) = \frac{\mu}{\lambda} B\left( x, \frac{\mu}{\lambda} + 1 \right) = \alpha B(x, \alpha + 1), \quad x = 1, 2, \ldots \tag{70}$$

where $B(x, \alpha + 1) = \Gamma(x) \Gamma(\alpha + 1) / \Gamma(x + \alpha + 1)$ is a Beta-function, $\Gamma(x)$ is a
Gamma-function ($\Gamma(x) = (x - 1)!$ for integer $x$), and $\alpha = \mu/\lambda$ is the characteristic exponent. For
instance, if $\alpha = 1$ then $p(x) = 1/[x(x + 1)]$. Let us assume that $x \to \infty$ and
apply the Stirling formula. Thus, the asymptotics of the Yule distribution
Eq.(70) is like Lotka law Eq.(66) (up to a normalizing constant):
$p(x) \propto \Gamma(\alpha + 1) \alpha / x^{1+\alpha}$.
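The step from the time average to the Beta-function form, and the power-law tail, can be checked numerically. The sketch below assumes the substitution $u = \exp(-\lambda t)$, which maps the integral in Eq.(69) to $(\mu/\lambda) \int_0^1 du \, u^{\mu/\lambda} (1 - u)^{x-1}$, and evaluates it with a simple midpoint rule. The function names, the parameter values $\lambda = 2$, $\mu = 3$, and the number of integration steps are hypothetical choices for illustration.

```python
import math


def yule_pmf(x, alpha):
    """p(x) = alpha * B(x, alpha + 1), evaluated through log-Gamma functions."""
    log_beta = math.lgamma(x) + math.lgamma(alpha + 1.0) - math.lgamma(x + alpha + 1.0)
    return alpha * math.exp(log_beta)


def yule_pmf_by_integration(x, lam, mu, steps=20000):
    """Midpoint-rule value of the time average, after substituting u = exp(-lam * t)."""
    alpha = mu / lam
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += u ** alpha * (1.0 - u) ** (x - 1)
    return alpha * total * h


closed_form = yule_pmf(3, 1.5)
numerical = yule_pmf_by_integration(3, lam=2.0, mu=3.0)   # alpha = mu / lam = 1.5

# Tail check: p(x) * x**(1 + alpha) should approach alpha * Gamma(alpha + 1).
tail_ratio = yule_pmf(10000, 1.5) * 10000 ** 2.5 / (1.5 * math.gamma(2.5))
```

The two evaluations of $p(3)$ should agree closely, and `tail_ratio` should be close to 1, which is the numerical counterpart of the Stirling-formula argument.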



6.2 Pareto distribution, Zipf-Mandelbrot and Brad- 
ford laws 

For large enough values of the total number of scientists and the total number
of publications, we can make the transition from discrete to continuous
representation of the corresponding variables and laws. The continuous analog
of Lotka law, Eq.(66), is the Pareto distribution

$$p(x) = \frac{\alpha}{x_0} \left( \frac{x_0}{x} \right)^{\alpha+1}; \quad x \ge x_0; \quad \alpha > 0, \tag{71}$$

which describes the distribution density for the number of scientists with $x$ papers;
$x_0$ is the minimal productivity, $x_0 \le x < \infty$, and $x$ is a continuous quantity.
Zipf law is connected to the principle of least effort [151]: a person will 
try to solve his problems in such a way as to minimize the total work that he 
must do in the solution process. For example, to express with many words 
what can be expressed with a few is meaningless. Thus, it is important to 
summarize an article using a small number of meaningful words. Bradford 
law for the scattering of articles over different journals is connected to the 
success-breeds-success (SBS) principle [152] : success in the past increases 
chances for some success in the future. For example, a journal that has been 
frequently consulted for some purpose is more likely to be read again, rather 
than one of previously infrequent use. 

In order to obtain the law of Zipf-Mandelbrot, we start from the following
version of Lotka law: $n_x = C/x^{1+\alpha}$, where $x$ is the scientist's
productivity, $\alpha$ is a characteristic exponent, and $C$ is a constant
which in most cases is equal to the number of authors with the minimal
productivity $x = 1$, i.e., to $n_1$. On the basis of this formula, the number
of scientists $r$ who are characterized by productivity
$x_r \leq x \leq k_{\max}$ ($k_{\max}$ is the maximal productivity of a
scientist) reads



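$$r = \sum_{x = x_r}^{k_{\max}} n_x \approx C \int_{x_r}^{k_{\max}} \frac{dx}{x^{1+\alpha}} = \frac{C}{\alpha}\left(\frac{1}{x_r^{\alpha}} - \frac{1}{k_{\max}^{\alpha}}\right). \qquad (72)$$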



Depending on the value of $x_r$, the number $r$ can have values 1, 2, 3, . . .
and in such a way the scientists can be ranked. If all scientists of a
scientific community working on the same topic are ranked in the order of the
decrease of their productivity, the place of a scientist who has written $x_r$
papers will be determined by his/her rank $r$. When the productivity of a
scientist $x_r$ is found from Eq.(72) as a function of rank $r$, the
relationship reads

$$x_r = \frac{A}{(r + B)^{\gamma}}; \quad A = (C/\alpha)^{1/\alpha}; \quad B = C/(\alpha k_{\max}^{\alpha}); \quad \gamma = 1/\alpha. \qquad (73)$$

This is the rank law of Zipf-Mandelbrot, which generalizes Zipf law:
$f(r) = c\,r^{-\beta}$; $r = 1, 2, 3, \ldots$, where $c$ and $\beta$ are
parameters. Zipf law was discovered
by counting words in books. If words in a book are ranked in decreasing 
order according to their number of occurrences, then Zipf law states that the 
number of occurrences of a word is inversely proportional to its rank r. 
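
The passage from Eq.(72) to Eq.(73) can be checked numerically. The Python
sketch below assumes the Lotka form $n_x = C/x^{1+\alpha}$ and writes the rank
law as $x_r = A/(r + B)^{\gamma}$, the form that inverts Eq.(72) exactly with
$A = (C/\alpha)^{1/\alpha}$, $B = C/(\alpha k_{\max}^{\alpha})$ and
$\gamma = 1/\alpha$. Function names and the sample parameter values are
hypothetical.

```python
def rank_from_productivity(x_r: float, C: float, alpha: float,
                           k_max: float) -> float:
    """Eq.(72): r = (C / alpha) * (x_r**(-alpha) - k_max**(-alpha))."""
    return (C / alpha) * (x_r ** -alpha - k_max ** -alpha)


def productivity_from_rank(r: float, C: float, alpha: float,
                           k_max: float) -> float:
    """Eq.(73), written as x_r = A / (r + B)**gamma."""
    A = (C / alpha) ** (1.0 / alpha)
    B = C / (alpha * k_max ** alpha)
    gamma = 1.0 / alpha
    return A / (r + B) ** gamma


if __name__ == "__main__":
    C, alpha, k_max = 100.0, 1.5, 50.0  # illustrative values only
    r = rank_from_productivity(5.0, C, alpha, k_max)
    # Mapping the rank back should recover the productivity of 5 papers.
    print(r, productivity_from_rank(r, C, alpha, k_max))
```

Under these assumptions, rank $r = 0$ corresponds to the maximal productivity
$k_{\max}$, and productivity decreases as the rank grows.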

Assuming that in Lotka law the exponent takes the value $\alpha = 1$ and that
in most cases $C = n_1$, one has $x_r = n_1/(r + a)$, where
$a = n_1/k_{\max}$, $r > 0$.
Integration of the last relationship yields the total productivity $R(n)$ of
all scientists, beginning with the one with the greatest productivity
$k_{\max}$ and ending with the scientist whose productivity corresponds to the
rank $n$ (the scientists are ranked in the order of diminishing productivity;
the rank is assumed to be a continuous-like variable):

$$R(n) = n_1 \ln\left(\frac{n}{a} + 1\right). \qquad (74)$$



This is Bradford law. According to this law, for a given topic, a large number
of relevant articles will be concentrated in a small number of journals. The
remaining articles will be dispersed over a large number of journals. Thus,
if scientific journals are arranged in order of decreasing number of published
articles on a given subject, they may be split into a core of journals more
particularly devoted to the subject and a shell consisting of sub-shells of
journals, each containing the same number of articles as the core. Then the
numbers of journals in the core zone and the succeeding sub-shells will follow
the relationship $1 : n : n^2 : \ldots$
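
The cumulative form of Bradford law can be illustrated with a short Python
sketch. It assumes that $x_r = n_1/(r + a)$ with $a = n_1/k_{\max}$ is
integrated from rank 0 to rank $n$, which gives $R(n) = n_1 \ln(n/a + 1)$, and
it compares this closed form with a midpoint-rule sum. The zone helper simply
lists journal counts in the ratio $1 : n : n^2$; all names and parameter values
are hypothetical.

```python
import math


def x_of_rank(r: float, n1: float, k_max: float) -> float:
    """Rank-productivity relation x_r = n1 / (r + a), with a = n1 / k_max."""
    a = n1 / k_max
    return n1 / (r + a)


def cumulative_productivity(n: float, n1: float, k_max: float) -> float:
    """Closed form R(n) = n1 * ln(n / a + 1), assuming integration from 0 to n."""
    a = n1 / k_max
    return n1 * math.log(n / a + 1.0)


def midpoint_total(n: float, n1: float, k_max: float,
                   steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of x_r over [0, n]."""
    h = n / steps
    return h * sum(x_of_rank((i + 0.5) * h, n1, k_max) for i in range(steps))


def bradford_zone_sizes(core: int, multiplier: int, zones: int) -> list:
    """Journal counts per zone in the ratio 1 : n : n**2 : ..."""
    return [core * multiplier ** i for i in range(zones)]


if __name__ == "__main__":
    n1, k_max = 100.0, 50.0  # illustrative values only
    print(cumulative_productivity(20.0, n1, k_max),
          midpoint_total(20.0, n1, k_max))
    print(bradford_zone_sizes(3, 4, 3))
```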



FOR POLICY-MAKERS 
Take away box Nr. 15: 

The Zipf-Pareto law, in the case of the distribution of scientists with respect 
to their productivity, indicates that one can always single out a small number 
of productive scientists who wrote the greatest number of papers on a given 
subject, and a large number of scientists with low productivity. The same 
applies also to scientific contacts, citation networks, etc. This specific feature 
(so-called hierarchical stratification) of the Zipf-Pareto law reflects a basic 
mechanism in the formation of stable complex systems. This can/must be 
taken into account in the process of planning and the organization of science. 



7 Concluding remarks 

Knowledge has a complex nature. It can be created. It can lead to innovations
and new technologies, and on this basis, knowledge supports the advance
and economic growth of societies. Knowledge can be collected. Knowledge
can be spread. Diffusion of ideas is closely connected to the collection and 




spreading of knowledge. Some stages of the diffusion of ideas can be described
by epidemic models of scientific and technological systems. Most of
the models described here are deterministic, but if the internal and external 
fluctuations are strong, then different kinds of models can be applied taking 
into account stochastic features. 

Much information about properties and stability of the knowledge systems 
can be obtained by the statistical approach on the basis of distributions con- 
nected to the Lotka-Volterra models of diffusion of knowledge. Interestingly, 
new terms occur in the usual evolution equations because of the variability 
and flexibility in the opinions of actors, due to media contacts or interper- 
sonal contacts, when exchanging ideas. 

The inclusion of spatial variables in the models leads to new research 
topics, such as questions on the spreading of systems of ideas and competition 
among ideas in different areas/countries. 

In conclusion, the epidemiological perspective contributes one piece of the
mosaic toward a better understanding of the dynamics of diffusion of ideas in
science, technology, and society, which should be one of the main future tasks
of the science of science [153].